EYSENCK’S TREATMENT OF THE PERSONALITY
OF COMMUNISTS 
RICHARD CHRISTIE 
Columbia University¹


A current problem in personality 
theory is that of the relationship be- 
tween personality variables and sus- 
ceptibility to deviant political ideolo- 
gies. A considerable amount of 
evidence has been collected on indi- 
viduals on the right-wing of the body 
politic. A current source of frustra- 
tion for American psychologists, how- 
ever, is the paucity of relevant data 
on members of the extreme left. For 
obvious reasons, recent years have 
seen a marked shrinkage in the size 
of this population and an increase 
in sampling difficulties. It is therefore 
of interest to find that attention is 
directed toward the personality char- 
acteristics of communists, among 
others, by H. J. Eysenck of Maudsley 
Hospital, London, in his recent book 
The Psychology of Politics (8).

One of Eysenck’s major conten- 
tions is that communists and fascists 
are similar in being “tough-minded”
and “authoritarian.” This is a highly
plausible hypothesis. However, a 
careful examination of his data indi- 
cates that if his measures of these 
attributes were valid, quite different 
conclusions would be drawn. 

The present critique shall be restricted to methodological points and
their implications for a more adequate understanding of the relationships
between certain aspects of personality and political ideology.
Eysenck’s interpretation of his data on communists and fascists is crucial
for his theoretical schema but a thorough evaluation of the latter
would add unduly to the length of this paper.² Detailed documentation
in support of the present criticism will be presented and only those
inaccuracies and inconsistencies of Eysenck’s which are pertinent to our
specific topic shall be cited.

1 This article was written while the author was a Fellow at the Center
for Advanced Study in the Behavioral Sciences. An earlier draft was
substantially modified as a result of suggestions made by colleagues.
Ramon J. Rhine assisted the author in statistical calculations.


ARE COMMUNISTS AND FASCISTS
SIMILAR IN BEING “TOUGH-MINDED”?


References to the finding that communists and fascists differ from
less politically deviant samples in being more “tough-minded” are
scattered throughout The Psychology of Politics. This is considered
demonstrated by scores on a scale designed to measure
“tough-tender-mindedness.” Examination of the evidence indicates:
(a) that no confident generalizations are justified upon the basis of
the sampling procedures used in selecting samples from the parent
populations; (b) that the scale does not measure “tough-mindedness,”
at least among communists; and (c) that Eysenck engages in misleading
manipulations of communist and fascist test scores in contrasting
them to various “neutral groups.”

2 A review of The Psychology of Politics by the writer may be found in
The American Journal of Psychology, 1955, 68, 702-704.

Purported evidence for tough- 
mindedness comes from two studies. 
The first was conducted by Eysenck; 
the second was an unpublished doc- 
toral dissertation done at the Uni- 
versity of London by Thelma Coulter 
which is cited by Eysenck. They shall 
be examined separately. 

THE EYSENCK STUDY 

In The Psychology of Politics Eysenck states, “When we average the
average scores of the groups on the T factor, i.e., without paying
attention to the fact that the number of cases is different between the
groups, we find that the Liberals are the most tender-minded with a
score of 7.7; that the Socialists and Conservatives follow next, with a
score of 7.0; and the combined Communist-Fascist group has much the
most tough-minded score (5.5)”³ (8, pp. 137-138). In evaluating this conclusion
the sampling, measuring, and analy- 
sis procedures shall be treated in 
order. 


Sampling


The largest group of subjects were middle-class adherents of the
Conservative, Liberal, and Socialist parties. A smaller sample of
working-class members of the same parties was also obtained. In
addition, communist subjects were recruited from two branches of the
Communist party and a few fascists were obtained in an unspecified
manner.

The basic middle-class sample. This was composed of 250 middle-class
members of each of the three major British political parties. The
clearest statement of the sampling procedure may be found in an article
published in 1947 (5, pp. 53-58). Students in university classes,
university extension classes, and in W. E. A. classes were required to
give from five to fifteen questionnaires each to friends and
acquaintances and have them answered. In this fashion 317 usable
questionnaires were collected from individuals identifying themselves as
supporters of the Conservative party, 256 from Liberals, and 409 from
Socialists. A sample of 250 was then drawn from each of the three
parties so that the samples were roughly equated for age, sex, and
education. The respondents came from an urban background (5, p. 54).

3 “Socialists” refers to members of, or voters for, the British Labor
Party.

The working-class sample. No 
specific information is given as to 
how this sample was selected. It is 
inferred that they were also given 
questionnaires by members of Eysenck’s classes since he notes that,
“The method of selection adopted has been explained in some detail in
the first paper of this series ...” (6, p. 200). Since the paper
referred to was devoted exclusively to middle-class respondents, it
cannot be determined whether the working-class questionnaires were
obtained at the same time as were those of the middle-class respondents
or at a later date. The number of protocols is much smaller, being 65
for the Conservatives, 27 for the Liberals, and 45 for
the Socialists (6, Table I, p. 201).
These respondents were also urban 
but there were no controls on age, 
sex, or education (6, p. 200). 

The communist sample. The procedure utilized in sampling communists
differed since these subjects were recruited directly through the
party organization. The total information available is: “Contact was
made with Party Branches through a member of the Communist Party
who undertook to collect the questionnaire replies. He used two
different branches, one primarily working-class, the other primarily
middle-class. Relatively few refusals were encountered among those
approached, in spite of a feeling that this type of work was ‘futile’”
(6, p. 200). Fifty protocols were collected from middle-class
communists and 96 from working-class communists (6, Table I, p. 201).

The fascist sample. The only information available as to the
recruitment of these subjects is the single sentence, “Only seven
middle-class persons could be found who were followers of Mosley and
may properly be called ‘fascists’” (6, p. 206).

Comparisons of samples. In the earlier article dealing with the
middle-class respondents Eysenck argued for what he termed analytic
sampling, i.e., he was interested in comparing attitudes of members of
the three major parties when other variables affecting or possibly
affecting attitudes were held constant: age, sex, education, and place
of residence (5, pp. 53-58). This is a legitimate approach and there is
no quarrel with it. Since no significant differences were found to be
related to the first three of these in the basic middle-class sample,
controls on them were dropped for the working-class, communist, and
fascist samples. Such a procedure is based upon an implicit assumption
that if there are no relationships between certain variables in one
sample, there will be none in a sample of quite a different nature.
Such an assumption may be valid but it needs to be demonstrated because
it has been shown that different relationships between attitudinal
variables hold in middle- and working-class samples (4, pp. 171-172; 10,
pp. 58-61). In other words, differences found by Eysenck between
middle- and working-class adherents of the major parties might well be
a result of uncontrolled factors and not simply a result of different
class membership.

These same criticisms might be expected to apply with even greater
force to comparisons between members of major British political parties
and those belonging to the communist or fascist parties. Almond’s
sample of middle-class communist defectors indicates strongly that they
were a deviant group aside from their political idiosyncrasies (2).
There is another problem which may lead to bias in the comparisons
between the communists and others. The communists were recruited
through an organization which implies active political interest:
adherents of the major political parties may or may not have been
politically active since they classified themselves as to “... the
group in which you would include yourself” when presented with a list
of parties (5, Table I, Q. 47, p. 78). It is well known that people who
belong to groups differ in many respects from those who do not (10,
pp. 61-63). Now it may be argued that communists are by definition
active group members and thus differ from the majority of the rest of
the population. It would nevertheless be extremely important to know
whether they are less different from those who are politically active
in major parties than from those who merely list themselves in a
particular party when asked to do so. In short, to what extent are
differences in attitudes between communists and major party members
traceable to ideology per se and to what extent to other factors
relating to political activity?

Eysenck’s failure fully to consider 
the implications of biases in sampling 
can best be illustrated by his discus- 
sion of the representativeness of his 
basic middle-class sample. He quite 
rightly notes that the initial results 
should not be generalized to rural or 
working-class populations. He subsequently admits that all British
urban middle-class people did not have an equal probability of being
drawn since his students were not sufficiently widely acquainted. He
then says, “... it seems unlikely that this
principle would affect very many 
middle-class people, or that it would 
be correlated in any systematic way 
with the type of attitude which is 
being studied. Careful scrutiny of 
the papers written by the students, 
and verbal questioning after discus- 
sion of sampling procedures, did not 
reveal any suggestion that our sam- 
ples were seriously biased; while this 
conclusion cannot, of course, be ac- 
cepted as definite proof, it is perhaps 
near enough the truth not to affect 
our conclusions in a very serious manner” (5, pp. 57-58).

It is the present contention that very
serious biases were present in Ey- 
senck’s basic middle-class sample as a 
result of the sampling procedures. 
Comparisons of his samples with esti- 
mates of the parent middle-class pop- 
ulation indicate this very clearly. 

First, consider the age distribu- 
tion of Eysenck’s basic middle-class 
sample. He dichotomized sample 
members as older and younger, the 
cutting point being thirty years of 
age (5, p. 57). The actual range and 
distribution of ages of respondents 
is not given (although curiously 
enough the range of ages of the stu- 
dents collecting the data is given!) 
(5, p. 54). Presumably, the respond- 
ents must have been of or almost of 
voting age or the results would be
almost meaningless. According to 
present calculations approximately 
20 per cent of the adult British popu- 
lation in 1951 was in the 20-29 year 
age range with 80 per cent falling 
over thirty years of age (based on 14, 
Table 7, p. 8). Yet 64.9 per cent 
(487 out of 750) of Eysenck’s
respondents were under 30 years of age
(calculations based on 5, Table 3, 
p. 78). What has happened is quite 
clear. Eysenck’s students tended to 
choose as respondents friends and 
acquaintances who were near their 
own age level and the entire distribu- 
tion was skewed toward the younger 
age groups. 
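
The size of this age skew can be checked directly from the figures quoted above. The following Python sketch recomputes the 64.9 per cent figure from the counts given (487 of 750) and sets it against the roughly 20 per cent share of British adults aged 20-29. The helper name `share` and the "over-representation ratio" are illustrative conveniences introduced here, not quantities reported by Eysenck.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts quoted in the text (calculations based on 5, Table 3, p. 78).
under_30_in_sample = 487
sample_size = 750

# Approximate share of British adults aged 20-29 in 1951, as estimated in the text.
under_30_in_population_pct = 20.0

sample_pct = share(under_30_in_sample, sample_size)

# Illustrative ratio (an assumption of this sketch): how many times more
# common the under-30 group is in the sample than among British adults.
over_representation = sample_pct / under_30_in_population_pct

print(f"under 30 in sample: {sample_pct:.1f} per cent")
print(f"over-representation ratio: {over_representation:.1f}")
```

On these figures the under-30 group is a little more than three times as common in the sample as among British adults.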

In view of this, the fact that Eysenck found only a slight and
nonsignificant tendency for the younger members of his sample to be
more radical is not at all puzzling. Eysenck says in reference to this
point, “The failure of the old in our sample, to be more Conservative
than the young, is perhaps also in opposition to expectation ...” (5,
p. 68). It is an elementary statistical principle that a truncated
distribution obscures relationships and this, it is suggested, is the
most probable reason for the lack of differentiation between Eysenck’s
so very young “young” group and his not so old “old” group.
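
The principle that truncation obscures relationships can be illustrated with a small simulation. The sketch below is hypothetical throughout: the population, the strength of the age trend, the noise level, and the cutoff of 40 years are assumptions chosen for illustration, not values taken from Eysenck's data. It builds a population in which an attitude score rises with age, then compares the correlation in the full age range with the correlation among younger respondents only.

```python
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: ages 21-80, with an attitude score that rises
# weakly with age plus random noise (all values assumed for illustration).
ages = [rng.uniform(21, 80) for _ in range(5000)]
scores = [0.05 * age + rng.gauss(0, 1) for age in ages]

r_full = pearson(ages, scores)

# Truncated sample: keep only respondents under 40 (an assumed cutoff).
young = [(a, s) for a, s in zip(ages, scores) if a < 40]
r_truncated = pearson([a for a, _ in young], [s for _, s in young])

print(f"full age range:  r = {r_full:.2f}")
print(f"under 40 only:   r = {r_truncated:.2f}")
```

The relationship built into the simulated population is the same in both cases; only the range of ages sampled differs, and the narrower range yields the smaller correlation.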

A similar criticism may be leveled 
against the educational bias in this 
sample. Those “who have had a university education” totaled 57.7 per
cent (computed from 5, Table 3, p. 78); the precise definition of
university education, whether merely attendance or graduation, is not
specified. There are no figures available
which give the number of university 
graduates or of those with some uni- 
versity attendance in Great Britain. 

It is possible, however, to piece together bits of information which
indicate that Eysenck’s sample is extremely highly educated as
contrasted to the British middle-class.
British census data give a detailed 
breakdown of university attendance. 
In 1950-51 there were 102,012 full 
and part-time university students in 
Great Britain who were taking 
courses (14, Table 108, p. 90) and 
17,337 first degrees were given (com- 
puted from 14, Table 111, p. 92). 
Comparable figures for the United 
States in 1950 are 2,659,021 students 
attending colleges and universities 
(16, Table 140, p. 125) and 432,058 
first degrees (16, Table 139, p. 124). 
For every British university student 
there were 26.07 American students; 
for every British first degree there 
were 24.92 American first degrees. 
When corrections for the total popu- 
lations of the two countries are made 
it is apparent that the ratio of uni- 
versity students in England and the 
United States is approximately one 
to eight or nine. It is impossible to determine precisely the effect
of foreign students upon these comparisons, although it is believed
not to be marked; less than 10 per cent of British full-time
students in 1950-51 were from outside the United Kingdom (computed
from 14, Table 108, p. 90).
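
The two ratios and the population correction can be retraced as follows. The enrollment and degree counts are those quoted in the text; the two population figures are round approximations assumed here for illustration (about 151 million for the United States in 1950 and about 49 million for Great Britain in 1951) and do not come from the sources cited.

```python
# Counts quoted in the text.
gb_students = 102_012        # full- and part-time, 1950-51 (14, Table 108)
gb_first_degrees = 17_337    # (computed from 14, Table 111)
us_students = 2_659_021      # 1950 (16, Table 140)
us_first_degrees = 432_058   # (16, Table 139)

student_ratio = us_students / gb_students
degree_ratio = us_first_degrees / gb_first_degrees

# ASSUMPTION: approximate total populations, not taken from the sources cited.
US_POPULATION = 151_000_000
GB_POPULATION = 49_000_000
population_ratio = US_POPULATION / GB_POPULATION

# Per-capita ratios: how many times more common students (or first
# degrees) were in the United States once population size is allowed for.
per_capita_students = student_ratio / population_ratio
per_capita_degrees = degree_ratio / population_ratio

print(f"American students per British student: {student_ratio:.2f}")
print(f"American first degrees per British first degree: {degree_ratio:.2f}")
print(f"per-capita student ratio: {per_capita_students:.1f}")
print(f"per-capita degree ratio:  {per_capita_degrees:.1f}")
```

With these assumed populations both per-capita ratios fall between eight and nine, in line with the "one to eight or nine" stated above.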
Fortunately, for present purposes, recent census data in the United
States (since 1940) contain estimates as to the amount of education
received. Thus in 1950, 5,784,570 Americans claimed four or more years
of college (based upon a 3⅓ per cent sample) (15, Table A, p. SB-12).
This amounts to 5.9 per cent of the 97,403,307 Americans over 21 years
of age in 1950. If roughly half the American adults were to be
considered middle-class,⁴ the proportion of college graduates would be
around 12 per cent among them (since education is a measure of class).
A recent estimate by the Census Department indicates that 15.4 per cent
of the adult American population has had some college education (or
roughly 30 per cent of the middle-class) (cited in 13, p. 238).

Since age distributions and the relative rates of growth of
institutions of higher learning differ slightly in the United States
and Great Britain, present estimates are rough. Application of the
ratios previously determined to the preceding figures would suggest
that the proportion of adults in the British middle-class (similarly
assuming roughly half the population as middle-class)⁵ having a
university degree would not be much above 2 per cent and that of those
having some university education not much above 5 per cent. These are
rough estimates but it is believed that they are not grossly in error.
They are so far below the 57.7 per cent of Eysenck’s sample that it is
clear that his sample was completely unrepresentative of the British
middle-classes.

4 This is a crude estimate. Centers (3, Table 8, p. 57) broke down a
1945 Gallup cross-sectional sample of white males of 21 years of age or
over in the United States into the following groupings for urban
residents: all business, professional, and white collar (N = 430); all
urban manual (N = 414). The rural categorization was: farm owners and
managers (N = 153); farm tenants and laborers (N = 69). If the initial
categories are considered middle-class, slightly over half the sample
would be so classified. If nonwhites had been included the proportion
of middle-class would presumably decrease slightly. In view of sampling
errors (3, Table 1, p. 38) an estimate of 50 per cent seems a
reasonable approximation.

5 The question of the comparability of the proportions of the
population in similar classes in Great Britain and the United States is
a puzzling one since different criteria are apparently used by the
Gallup organizations in the two countries. Centers found 43 per cent of
his sample classified themselves as middle-class (3, Table 18, p. 77).
Eysenck presents data on British class identification (8, Table III,
p. 18). Present calculations indicate that 41.7 per cent identified
themselves as either middle or lower-middle class (in obtaining this
figure the computed total of 8,890 was used as a denominator since
Eysenck’s addition is erroneous). Since subjective class identification
is substantially correlated with external ratings of class membership,
these figures suggest that the proportion of middle-class individuals
in the two countries is not too dissimilar. However, the wording of the
alternatives in the two surveys differed, and differences in the
meaning of class labels in the two countries are an unknown factor.

It would be tedious to continue demonstration of other aspects of the
nonrepresentative nature of Eysenck’s middle-class sample. Such biases
might well be expected to have a major effect on the attitudes elicited
from subjects. Eysenck notes that the correlation between social class
and political attitudes in Great Britain is .67 (8, p. 19), “... that
social class estimates are determined almost completely by social
status” (8, p. 20) and that, “... education ... is of course so closely
related to status that the results are almost a foregone conclusion”
(8, p. 20).

In view of the fact that Eysenck’s basic middle-class sample is
markedly unrepresentative of the British middle-classes, it would be
highly dangerous to project their attitudes to obtain an estimate of
the parent populations. Yet Eysenck suggests this possibility both in
an earlier article (5, p. 57) and in The Psychology of Politics
(8, p. 127).

Of crucial importance for the present discussion, however, is the fact
that the comparisons of scale scores among groups belonging to
different social classes and political parties embody not only these
differences but many other uncontrolled biases as well. No confidence
can be placed in the generality of the differences in scores found
among the groups studied with the exception of comparisons within the
middle-class where age, sex, and education were roughly controlled.
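
The education estimate developed earlier in this section can be retraced step by step. The sketch below uses only figures quoted in the text; bracketing the per-capita ratio at 8 and at 9 is an interpretive assumption standing in for the phrase "approximately one to eight or nine," and the variable names are illustrative.

```python
# United States, 1950 (figures quoted in the text).
us_college_graduates = 5_784_570
us_adults_over_21 = 97_403_307
us_some_college_pct = 15.4      # adults with some college education

# Working assumption stated in the text: roughly half of adults are middle-class.
MIDDLE_CLASS_SHARE = 0.5

us_graduate_pct = 100 * us_college_graduates / us_adults_over_21
us_mc_graduate_pct = us_graduate_pct / MIDDLE_CLASS_SHARE
us_mc_some_college_pct = us_some_college_pct / MIDDLE_CLASS_SHARE

# ASSUMPTION: the "one to eight or nine" ratio is bracketed at 8 and 9.
for ratio in (8, 9):
    gb_mc_graduate_pct = us_mc_graduate_pct / ratio
    gb_mc_some_university_pct = us_mc_some_college_pct / ratio
    print(
        f"ratio 1:{ratio}  British middle-class graduates ~{gb_mc_graduate_pct:.1f}%,"
        f" some university ~{gb_mc_some_university_pct:.1f}%"
    )

# Share of Eysenck's basic middle-class sample reporting a university education.
eysenck_sample_pct = 57.7
print(f"Eysenck's sample: {eysenck_sample_pct}%")
```

Under either ratio the estimates stay below the rough ceilings of 2 and 5 per cent given above, and far below the 57.7 per cent found in the sample.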


The Measurement of “Tough-Mindedness”

Rokeach and Hanley (12) have discussed Eysenck’s T
(“tough-mindedness”) factor.⁶ A re-examination of the portions of
Eysenck’s work to which they refer clearly indicates that the mean
scores reported by Eysenck are in disagreement with the data which he
reported. Aside from such computational errors, there are other aspects
of the T scale which are relevant in any attempt to uncover the
significance of scores on the T scale made by samples of different
political affiliation.

Three biasing factors which are empirically related to the scores made
on the T scale among the samples examined by Eysenck have been
uncovered. These are: (a) the treatment of the “no-answer” category,
(b) the asymmetric nature of the scale, and (c) the different
interpretation of the items among various samples. Each of these
biasing effects shall be considered separately.

Treatment of the “no-answer” category. In most attitude scales the
treatment of neutral categories is ex- 
plicitly or implicitly based upon the 
assumption that a respondent who 
does not have an attitude, can not 
make up his mind as to the answer to 
the specific question asked, or other- 
wise does not agree or disagree, is not 
an extremist in terms of whatever the 
scale presumably measures. Likert- 
type scales are constructed so that 
such a reply (or lack of one) to an 
item is scored intermediately be- 
tween acceptance and rejection. 

The scoring system employed by Eysenck is based upon other
(unspecified) assumptions. Nine of the fourteen items (see Fig. 1) are
“tough-minded” and five are “tender-minded.” Respondents are allowed to
choose among the following alternatives: “strongly approve” (++),
“approve on the whole” (+), “can’t decide for or against, or if you
think that the question is inadequately worded” (zero), “disapprove on
the whole” (-), or “strongly disapprove” (- -) (8, p. 122). If the
particular item being scored is one of the five tender-minded ones,
agreement, whether “strong” or “on the whole,” is given one point. All
forms of neutrality and disagreement are arbitrarily scored as zero on
the T scale. If the item in question happens to be one of the nine
tough-minded ones, disagreement of any variety is given a weight of
one. Any form of agreement and “zero” responses are given no weight.

6 The present critique has been modified as a result of Rokeach and
Hanley’s analysis to minimize duplication.

Such a scoring system has interesting inherent properties. A respondent
who disagreed with everything, whatever the content, would
automatically have a score of nine. Disgruntlement, in this case, leads
to a score which is more tender-minded (by virtue of the scoring
system) than would be true in the case of one who accepted all items.
Acceptance, however strong, of all items would result in a maximum
total score of five. If, however, a respondent, for reasons noted or
others, could not agree or disagree with any item his final score
would be exactly zero.
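
The scoring rule just described can be written out as a short function. This is a minimal sketch of the rule as summarized in this critique, not Eysenck's own procedure: the numeric coding of the five response alternatives (+2, +1, 0, -1, -2), the ordering of items, and the names `ITEM_KEYS` and `t_score` are assumptions introduced for illustration.

```python
from typing import Sequence

TOUGH, TENDER = "tough", "tender"

# Nine tough-minded and five tender-minded items, as described in the text.
# The ordering of items here is arbitrary (an assumption of this sketch).
ITEM_KEYS = [TOUGH] * 9 + [TENDER] * 5


def t_score(responses: Sequence[int], keys: Sequence[str] = ITEM_KEYS) -> int:
    """Score responses coded +2, +1, 0, -1, -2 (assumed coding).

    A tender-minded item earns one point for any agreement; a tough-minded
    item earns one point for any disagreement; everything else, including
    the undecided "zero" response, earns nothing. Higher totals are read
    as more tender-minded.
    """
    score = 0
    for response, key in zip(responses, keys):
        if key == TENDER and response > 0:
            score += 1
        elif key == TOUGH and response < 0:
            score += 1
    return score


disagree_with_everything = [-2] * 14
agree_with_everything = [+2] * 14
undecided_on_everything = [0] * 14

print(t_score(disagree_with_everything))  # 9
print(t_score(agree_with_everything))     # 5
print(t_score(undecided_on_everything))   # 0
```

The three printed totals reproduce the cases discussed above: blanket disagreement scores nine, blanket agreement five, and a respondent who cannot decide on any item zero, the extreme tough-minded pole.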

The logic of a scale which classifies a person who disagrees with every
item of a battery of items as high in tender-mindedness and one who
“can’t decide for or against, etc.” as being at the extreme pole of
tough-mindedness is difficult to understand.

Of more direct interest is the extent to which Eysenck’s use of the no
response category affects the T-scale comparisons of his samples. If
there were no systematic differences in response sets among the members
of the various samples the possibility of bias would be largely
vitiated. Since, however, there are more tough- than tender-minded
items the scoring of the zero category operates to make samples
characterized by a high proportion of indeterminate answers more
tough-minded. Unfortunately, Eysenck does not report the frequency of
these responses among members of the various samples. He does, however,
roughly report the frequency of extreme responses (++ or - -), “... we
find that only 3[?] per cent of the socialist, liberal and conservative
responses have been marked in this fashion, but 54 per cent and 51 per
cent respectively of the middle-class and working-class communist
responses” (6, p. 206). Among the fascists, “It is of interest to note
that these seven subjects were the most emphatic of all, their
proportion of ++ and - - scores being 67 per cent” (6, p. 207).

The above figures indicate that since the communist and fascist samples
checked more extreme responses than did members of the three major
parties, they also had a lower frequency of zero responses. Such an
inference is in agreement with the known relationship between
extremity and intensity of attitude. As Eysenck notes in a discussion
of previous research, “... when the different groups who had taken part
in the study were compared there was a marked tendency for the more
extreme groups to be more certain of their opinions. This
characteristic we shall find again in our discussion of Communist and
Fascist ideologies” (8, p. 120).

Upon the basis of available data it 
can be inferred that the conservatives,
liberals, and socialists sampled
had a higher frequency of zero re- 
sponses than did the samples of com- 
munists and fascists. The arbitrary 
system of scoring which treated zero 
responses as tough-minded thus in- 
troduced a bias of unknown extent 
in the direction of making the mem- 
bers of the three major parties more 
tough-minded, relatively speaking, 
than those of the two deviant parties. 

The asymmetric nature of the scale. 
In the present analysis of the T scale 
considerable importance is attached 
to the fact that the items also meas- 
ure radicalism and conservatism. Of 
the fourteen items, nine have a 
higher saturation on R (radicalism 
and conservatism) than on T. No 
item is clearly an independent meas- 
ure of T. The original T scale was 
based upon a factor analysis of 40 
items responded to by the basic mid- 
dle-class sample previously discussed. 
A total of 23 items had saturations of 
+.20 or greater upon the T factor. 
It is impossible upon the basis of an 
inspection of the saturations to de- 
termine why some items were in- 
cluded in the T scale when other 
items with higher saturations were 
not (8, Table XX, p. 129). This is 
in direct conflict with what Eysenck 
says in The Psychology of Politics: 
“... we must obviously construct
measuring instruments for R and T 
respectively. Two scales were ac- 
cordingly constructed by combining 
the items most highly correlated with 
the two factors respectively, each scale 
consisting of 14 items” (8, p. 133,
italics mine). 

The reliability of the T scale was .64 (.80 when corrected by the
Spearman-Brown formula) and that of the R scale .81 (.90 corrected)
(5, p. 65). The lower reliability of the T scale is not surprising
since the variance accounted for by T and R is 8 and 18 per cent
respectively (5, p. 59), some of the items in the R scale have
practically no saturation on T, and the lowest saturation of any
R-scale item on R is .45 in contrast to the low of .20 of T-scale
items on T (8, Table XX, p. 129).

The crucial point in an interpreta- 
tion of Eysenck’s results is that the T 
scale is a somewhat better measure of 
R than T. The mean loading of T- 
scale items on T is .38, on R, .48 (cal- 
culated from data in 8, Table XX, p. 
129). 

If we consider, as Eysenck does, communists to be both tough-minded
and radical, fascists to be tough-minded and conservative,
conservatives to be tender-minded and conservative, and socialists to
be tender-minded and radical, certain consequences follow in the
determination of the scores of members of these parties. An examination
of Fig. 1 indicates that there are different numbers of items in the
four quadrants. When this fact is combined with Eysenck’s scoring
system, strange results may be expected. Assume that a hypothetical
consistent communist and his fascist counterpart were answering the
items in the T scale, hypothetical perfection being defined as being
well indoctrinated in their respective ideologies (radical and
conservative), and that both were equally tough-minded. Both would
receive a total of exactly no points for rejecting the five
tender-minded items in the T scale. The hypothetically perfect
communist would receive five points for rejecting the five items in the
conservative tough-minded quadrant and no points for accepting (however
strongly) the four items in the radical tough-minded quadrant. The
consistent fascist, on the other hand, should reject the four items in
the radical tough-minded quadrant (four points) but would receive no
points for accepting the five conservative tough-minded items.

[Fig. 1. Surviving item label: 16. Only by going back to religion can
civilization hope to survive]

By virtue of an asymmetric dis- 
tribution of items combined with 
Eysenck’s singular scoring system, 
a hypothetically consistent fascist 
is automatically made more tough- 
minded by one point than a hypo- 
thetically consistent communist. The 
confusion inherent in such a scoring 
system becomes the more puzzling 
since Eysenck persists in lumping 
fascists and communists together as 
being tough-minded. 

A similar analysis indicates that 
a hypothetically consistent socialist 
should be more tender-minded by 
one point than a hypothetically con- 
sistent conservative. Since many of 
the differences in mean scores on the 
T scale are less than a point apart, the 
preceding indication of the impor- 
tance of differential weighting aris- 
ing from the scoring system takes 
on crucial importance. The field of 
attitude measurement is bedeviled 
with enough problems without in- 
cluding scales with built-in biases 
based upon unspecified assumptions. 
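
The one-point differences described in this section can be reproduced with a small calculation. The item counts per quadrant are those given in the text. The response rule is an assumption made here to generalize the communist and fascist walk-through: a "consistent" respondent accepts only items matching both his mindedness and his ideology, and rejects all others. The names `QUADRANT_ITEMS` and `consistent_score` are illustrative.

```python
# Number of T-scale items in each (mindedness, ideology) quadrant,
# as given in the text: 4 + 5 tough-minded, 3 + 2 tender-minded.
QUADRANT_ITEMS = {
    ("tough", "radical"): 4,
    ("tough", "conservative"): 5,
    ("tender", "radical"): 3,
    ("tender", "conservative"): 2,
}


def consistent_score(mindedness: str, ideology: str) -> int:
    """T score of a hypothetically consistent respondent.

    ASSUMED response rule: accept an item only when it matches both the
    respondent's mindedness and ideology; reject it otherwise.
    Scoring follows the rule described in this critique: accepted tender
    items and rejected tough items earn one point each.
    """
    score = 0
    for (item_mindedness, item_ideology), n_items in QUADRANT_ITEMS.items():
        accepts = item_mindedness == mindedness and item_ideology == ideology
        if item_mindedness == "tender" and accepts:
            score += n_items
        elif item_mindedness == "tough" and not accepts:
            score += n_items
    return score


communist = consistent_score("tough", "radical")
fascist = consistent_score("tough", "conservative")
socialist = consistent_score("tender", "radical")
conservative = consistent_score("tender", "conservative")

print(communist, fascist)       # 5 4
print(socialist, conservative)  # 12 11
```

Because higher totals are read as more tender-minded, the consistent fascist comes out one point tougher than the consistent communist, and the consistent socialist one point more tender than the consistent conservative, purely from the unequal quadrant sizes.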

Interpretation of the items. In the 
preceding section we have dealt with 
the responses which a hypothetical
communist might make to the T 
scale. The underlying assumption 
was that he should reject all items 
except those which were radical 
and tough-minded since these are the 
characteristics which Eysenck at- 
tributes to members of the Commu- 
nist Party. An examination of Ey- 
senck’s data indicates that the com- 
munists sampled responded in quite 
a different fashion. 

Table 1 shows the mean percent- 
age acceptance of items falling in 
these quadrants by communists and 
members of other parties (with the 
exception of the small fascist sample 
for whom data are not given). Both 
middle- and working-class commu- 
nists show markedly greater accept- 
ance of radical tough-minded items 
and greater rejection of conservative 
tender-minded items than subjects 
affiliated with other parties. These 
results are completely in line with 
Eysenck’s analysis and the expecta-
tions of anyone familiar with politi- 
cal attitudes. 

TABLE 1
MEAN PERCENTAGE ACCEPTANCE OF T-SCALE ITEMS BY QUADRANT, SOCIAL CLASS,
AND POLITICAL AFFILIATION*

Quadrant                    Item Nos.
Tough-minded
  Radical                   29, 9, 23, 15
  Conservative              13, 1, 3, 39, 5
Tender-minded
  Conservative              16, 28
  Radical                   10, 8, 36

Column groups: Middle-Class and Working-Class, each by party (Comm.,
Soc., Lib., Cons.). The percentage entries did not survive reproduction,
apart from a single 84 in the tender-minded radical row.

* Taken from (6, Table III, p. 203).

However, the theoretically crucial responses are communist responses
to T-scale items which fall into the other two quadrants. Both middle-
and working-class communists are markedly less receptive to the items
falling in the conservative tough-minded quadrant than other sample
members and more receptive to those items falling in the tender-minded
radical quadrant. Examination of these figures suggests that the
communists sampled are not responding to the tough- or tender-mindedness
of T-scale items but rather to the radical or conservative content. Our
calculations indicate that the middle-class communist sample had a mean
acceptance of 91 per cent of the seven T-scale items (four tough- and
three tender-minded) with a radical saturation and only 7 per cent of
the seven items (five tough- and two tender-minded) with a conservative
loading. Comparable figures for working-class communists are 81 and 14
per cent. On the other hand, if the responses to the nine tough-minded
items are examined, the mean acceptance is 47 per cent by both middle-
and working-class communists and mean acceptance of the five
tender-minded items is 47 and 50 per cent respectively.
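
The contrast in these figures can be made explicit by comparing the spread along each axis. The following is a minimal sketch, not part of the original analysis: the dictionary layout and function name are illustrative, and the working-class tender-minded figure is taken as 50, which is a reading of a damaged numeral.

```python
# Mean percentage acceptance reported in the text for the communist samples.
# The layout and key names are illustrative; only the percentages are from the text.
ACCEPTANCE = {
    "middle-class": {"radical": 91, "conservative": 7, "tough": 47, "tender": 47},
    # Assumption: the tender-minded figure for working-class communists is
    # read as 50 from a damaged numeral in the source.
    "working-class": {"radical": 81, "conservative": 14, "tough": 47, "tender": 50},
}


def axis_spreads(figures):
    """Return (radical minus conservative, tough minus tender) in percentage points."""
    radical_conservative = figures["radical"] - figures["conservative"]
    tough_tender = figures["tough"] - figures["tender"]
    return radical_conservative, tough_tender


for sample, figures in ACCEPTANCE.items():
    rc, tt = axis_spreads(figures)
    print(f"{sample}: radical-conservative spread = {rc}, tough-tender spread = {tt}")
```

On these figures the radical-conservative spread is 84 and 67 points for the two samples, while the tough-tender spread is 0 and 3 points, which is the pattern the argument relies on.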

The present interpretation of these figures is simple. The communists
sampled by Eysenck responded to the T scale not upon the basis of their
loadings on tough- and tender-mindedness but responded directly in terms
of their radical-conservative loading. The T scale simply does not apply
to communists (or at least to those sampled). Comparisons of scores made
by communists on a scale on which they did not respond along the
continuum measured with scores made by other samples are meaningless.

Analysis. An analysis of Eysenck’s 
treatment of results also leads to 
questions of interpretation. Let it be
assumed (as it assuredly is not) that
there are no problems in sampling
or measurement in the data which 
Eysenck reports. Let it further be 
assumed (despite our agreement with 
Rokeach and Hanley’s recomputa- 
tions) that Eysenck’s addition is cor- 
rect. It is still possible to raise ques- 
tions about the manner in which the 
data are treated. 

It has been previously noted that the comparisons of various “groups”
involved an “average of an average score.” The reported score of
liberals on the T scale thus was based upon the singular procedure of
adding the mean (7.9) of 250 middle-class liberals (as sampled) to that
(7.4) of 27 working-class liberals (as sampled) and dividing by two and
then rounding up the “average of an average” of 7.65 to 7.7 in deriving
the tough-mindedness score of liberals.
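
The difference between the two procedures can be checked directly from the liberal figures quoted above. This is a minimal sketch under the assumption that the conventional alternative is a mean weighted by subgroup size; the function names are illustrative.

```python
def average_of_averages(groups):
    """Unweighted mean of subgroup means; subgroup sizes are ignored."""
    means = [mean for mean, _size in groups]
    return sum(means) / len(means)


def pooled_mean(groups):
    """Mean over all respondents, weighting each subgroup mean by its size."""
    total = sum(size for _mean, size in groups)
    return sum(mean * size for mean, size in groups) / total


# (mean T-scale score, sample size) for middle- and working-class liberals,
# as quoted in the text.
LIBERALS = [(7.9, 250), (7.4, 27)]

print(round(average_of_averages(LIBERALS), 2))
print(round(pooled_mean(LIBERALS), 2))
```

Because the 27 working-class liberals count as heavily as the 250 middle-class liberals in the unweighted version, it comes out at 7.65, while the size-weighted mean is about 7.85.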

The “average of an average” treatment leads to even more remarkable
results when applied to the combined communist-fascist samples. The
only way in which the writer is able to arrive at the “average of an
average” score reported by Eysenck is as follows: add the mean T-scale
score of the 50 middle-class communists to that of the 96 working-class
communists and divide by two, which gives a score of 6.4; take the mean
score of seven fascists and add it to the previous figure, divide by
two, and round down the “average of an average” of 5.55 to 5.5 to obtain
the figure given by Eysenck.⁷

7 An examination of the rounding practice followed by Eysenck indicates
a systematic procedure. Contrary to more customary procedures of
rounding in a consistent direction, to odd or even numbers, etc.,
Eysenck’s roundings in these comparisons are such as to maximize the
discrepancy between communist-fascist and other political groupings.

Various possible comparisons of the scores of various samples (taking
middle- and working-class together) are given in Table 2. When the mean
score is used instead of the “average of an average” on Eysenck’s
reported sample means the communists are less deviant from the other
groups. If a mean is taken on the T scores as recomputed by Rokeach and
Hanley they become even less deviant. If one wished to weight the
middle- and working-class means by their relative proportion in various
political parties still different figures would be found.
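
The nested averaging described for the communist-fascist figure can be worked backwards as a rough check. The sketch below assumes the two-step procedure reconstructed in the text; the implied fascist mean it produces is an inference from 6.4 and 5.55, not a figure reported in the source, and all names are illustrative.

```python
# Sample sizes quoted in the text.
SIZES = {
    "middle_class_communists": 50,
    "working_class_communists": 96,
    "fascists": 7,
}


def implied_fascist_mean(communist_average, combined_average):
    """Solve (communist_average + fascist_mean) / 2 = combined_average."""
    return 2 * combined_average - communist_average


def per_respondent_weights(sizes):
    """Weight each respondent carries in the nested average.

    The two communist subgroups share half of the final figure equally;
    the fascists supply the other half.
    """
    group_weight = {
        "middle_class_communists": 0.25,
        "working_class_communists": 0.25,
        "fascists": 0.5,
    }
    return {group: group_weight[group] / n for group, n in sizes.items()}


fascist_mean = implied_fascist_mean(6.4, 5.55)
weights = per_respondent_weights(SIZES)
ratio = weights["fascists"] / weights["working_class_communists"]
print(round(fascist_mean, 2), round(ratio, 1))
```

Under these assumptions each of the seven fascists carries roughly 27 times the weight of a working-class communist in the combined figure, which illustrates why the procedure is so sensitive to the smallest sample.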

TABLE 2
COMPARISON OF TOUGH-MINDEDNESS SCORES OF MEMBERS OF VARIOUS
SAMPLES BY PARTY

Columns: Party; “Average of an Average”*; Mean of Eysenck’s Means†;
Mean of Rokeach and Hanley’s Means‡

Rows: Liberal, Conservative, Socialist, Communist§, Fascist

The cell entries did not survive reproduction; the only legible
fragments are 8.22, 7.51, and .78, whose rows and columns cannot be
recovered.

* Taken from (8, pp. 137-138).
† Computed from (8, Table XXIII, p. 138).
‡ Computed from (12, Table 2, p. 171).
§ Only one group.


It can only be concluded that 
Eysenck presented his data in such a 
way as to maximize the differences 
between communists and fascists on 
the one hand and other political par- 
ties on the other. The differences in 
mean T-scale scores of various sam- 
ples are less than the errors that 
might be reasonably expected to oc- 
cur from sampling biases and the 
peculiarities of the scoring system. 
It is impossible to place any reliance 
in the T-scale differences among vari- 
ous samples even if Eysenck’s unu- 
sual arithmetic practices are replaced 
by more conventional techniques. 


THE COULTER STUDY 


Eysenck believes that Coulter’s study represents confirmation of his
own findings. He states, “These results (Coulter’s) bear out in every
detail the results of the previous study (Eysenck’s), and we may
accordingly conclude that our main hypothesis is strongly supported”
(8, p. 142). This assertion is believed to be unjustified upon the
basis of available data.⁸


Sampling 

Coulter gave a battery of tests to 
three samples. All were composed 
of British working-class males (8, p. 
142). One was a “neutral” sample
of either 86 (8, pp. 142, 202) or 83
soldiers (8, p. 152). The criteria for
selection are not specified by Eysenck
although he states that they “... constituted
a fairly random sample of
the British working-class males” (8,
p. 142). No information is given as
to whether these soldiers were volun- 
teers or conscripts. Since military 
samples underrepresent older age 
groups and those older men in the 
Army tend to be ‘Old Army Men” 
who are certainly not typical of the 
working-class population, it is most 
unlikely that such a group would even 
roughly approximate a random sam- 
ple of the working-class. 

Coulter’s communist and fascist samples were each composed of 43
working-class males. As far as is known, no reliable estimates as to
the characteristics of the parent populations being sampled exist. It
is therefore impossible to determine the representativeness of these
samples.

8 Discussion of these data is restricted to what Eysenck reports
concerning them in The Psychology of Politics. Neither Coulter’s nor
Melvin’s thesis, both of which are germane to the topic, has been
published. A copy of Coulter’s thesis was examined after completion of
this critique. It has not been necessary to modify present criticisms.
Eysenck refers (8, p. 276) to “... Melvin (1954)....” The only Melvin
listed in the bibliography (8, p. 301) is “Melvin, D. An experimental
and statistical study of two primary social attitudes. Ph.D. Thesis,
Univ. London Lib., 1953.” According to a letter dated Feb. 14, 1955,
from the University of London Library, no such thesis had been filed
and no information was available concerning it.


Measurement 


Coulter used a revised set of R and T scales devised by Melvin. The
latter started with a pool of 60 items and factor analyzed the
questionnaires of 650 respondents of unspecified origin (8, p. 132).
Twenty of the forty items used by Eysenck were included, based upon a
comparison of (6, pp. 208-209) and (8, pp. 277-279). Of these, eleven
were used as measures of both R and T by Eysenck (Items 1, 3, 8, 9, 15,
16, 23, 28, 29, 36, and 39), three as measures of R only (Items 12, 26,
and 27), three as measures of T only (Items 5, 10, and 13), and three
were not included in the original R and T scales although they were in
Eysenck’s pool of items (Items 6, 18, and 35).

Melvin added another 40 items, although Eysenck says there were 38
(8, p. 132). An inspection of these indicates that they are fairly
similar to the ones originally used by Eysenck. In the new R and T
scales, the R scale was expanded to 16 items and the T scale underwent
drastic revision and was expanded to 32 items.

Of the eleven items measuring both 
R and T in Eysenck’s scaling system 
only two were used in the same fash- 
ion by Melvin (Items 29 and 35). 
One was used as a measure of T alone 
by Melvin (Item 23). The other 
eight did not emerge in Melvin's 
scales. The three original measures 
of T alone are not included in Melvin’s
T scale. Two of the original
measures of R alone used by Eysenck
perform the same function in Melvin’s
revision (Items 12 and 27). The
other (Item 26) is used by Melvin to 
measure both R and T. 

The scoring system used by Melvin
is identical to that used in Eysenck’s
original T scale. Twenty of the 32
items are in the tough-minded direc- 
tion, twelve in the tender-minded 
direction (8, pp. 276-279). It is im- 
possible to determine from the ma- 
terial Eysenck presents whether the 
asymmetry which served as a source 
of bias in his scale is also present in 
Melvin’s and this question cannot 
be answered due to the unavailability 
of the latter's thesis. It is clear, how- 
ever, that the criticism made previ- 
ously of the bias resulting from dif- 
ferential group response sets favoring 
utilization of the no response or zero 
category applies to Melvin’s revision. 

Eysenck notes that the split-half 
reliabilities of Melvin's revision of 
the R, T, and E (emphasis) scales 
lie between .85 and .95 in “...a 
relatively unselected group” (8, p. 
2 It is Eysenck’s contention that 
Melvin’s research “... showed that
our original results could be reproduced
with an entirely different set
of items” (8, p. 132). Data are not
available to evaluate the accuracy of 
this statement but the point is not 
germane to the present argument. If 
the scales are measuring the same 
dimension there is no reason to be- 
lieve that the uncritical application 
of the T scale to communists would 
not be subject to the same bias as 
that demonstrated in Eysenck’s work. 
If, on the other hand, they are meas- 
uring something different we are left 
in an even more puzzling situation as 
to what comparative scores on the 
tests mean. 

Analysis. The means of the “neutral,” communist, and fascist groups on
the T scale are not given by Eysenck. However, the distribution of
scores of the latter two groups is given and a point is presented which
represents the mean score of the “neutral” group (8, Fig. 26, p. 141).
Our calculations indicate that the means for the various groups are as
follows: “neutral,” 14.2 (interpolated approximation); communists,
11.05; fascists, 7.85. The striking point in this ordering is that the
communists fall almost exactly midway between the “neutral” and fascist
samples, being 3.15 units from the former and 3.2 units from the latter.
Standard deviations computed from the distributions indicate that the
scores of the fascists and communists differ significantly (CR of 4.18).
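
The midway position of the communists, and what the reported critical ratio implies, can be sketched as follows. The standard deviations are not reported in the text, so none are supplied here; the sketch assumes the usual large-sample form of the critical ratio (difference of means over the standard error of the difference), and the function and variable names are illustrative.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference of two independent means divided by its standard error."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# T-scale means as computed in the text.
T_MEANS = {"neutral": 14.2, "communist": 11.05, "fascist": 7.85}

gap_to_neutral = T_MEANS["neutral"] - T_MEANS["communist"]
gap_to_fascist = T_MEANS["communist"] - T_MEANS["fascist"]

# Assumption: CR = difference / standard error, so the reported CR of 4.18
# implies a standard error of roughly this size for the communist-fascist gap.
implied_standard_error = gap_to_fascist / 4.18

print(round(gap_to_neutral, 2), round(gap_to_fascist, 2))
print(round(implied_standard_error, 3))
```

The two gaps differ by only 0.05 of a scale point, so the data give as much warrant for grouping communists with the “neutral” sample as with the fascists.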

What this difference means is, of 
course, completely puzzling since 
there is no reason to suppose that the 
same vitiating circumstances which 
made the earlier comparisons mean- 
ingless do not apply with equal 
cogency. We see no greater reason 
on the basis of the data for lumping 
communists and fascists as different 
from a “neutral” group than for differentiating
fascists from “neutrals”
and communists.

There is one further matter which Eysenck does not touch upon but which
makes the comparison of Coulter’s samples somewhat questionable. It was
noted that a sample of soldiers was not very apt to be representative
of working-class males. Examination of the mean R-scale score of this
group raises the possibility that this is a most unusual group of
soldiers.

Melvin’s R scale is eight-sevenths 
the length of Eysenck’s. If we as- 
sume that acceptance of the items 
tends to be about the same on the 
two scales we can compare various 
groups within the two studies. The 
plausibility of such an assumption is 
indicated by the fact that Eysenck’s 
working-class communists had a mean 
score of 10.7 on R (8, Table XXIII, 
p. 138). Multiplying this figure by 
eight-sevenths we find a projected 
mean of 12.23 as the hypothetical 
value of working-class communists on 
Melvin’s revision. The actual mean 
computed is 12.90 on Coulter’s sample
(based on 8, Fig. 26, p. 141).
When we apply the same correction 
to the SD of Eysenck’s sample and 
our own calculations of the SD of 
Coulter's data, a test of significance 
indicates that Eysenck’s and Coulter's 
communist samples do not differ 
significantly in radicalism. This 
conclusion rests, of course, upon two 
assumptions—that the units of meas- 
urement in Eysenck’s and Melvin's 
scales are similar and that the two 
samples are comparable. 
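
The projection used in this comparison is a simple rescaling by the ratio of scale lengths. A minimal sketch, assuming, as the text does, that item acceptance is about the same on the two scales; the names are illustrative.

```python
# Melvin's R scale is eight-sevenths the length of Eysenck's.
SCALE_RATIO = 8 / 7


def project_to_melvin(eysenck_mean):
    """Rescale a mean on Eysenck's R scale to the length of Melvin's revision."""
    return eysenck_mean * SCALE_RATIO


projected = project_to_melvin(10.7)   # Eysenck's working-class communists
observed = 12.90                      # Coulter's communists, computed from Fig. 26
print(round(projected, 2), round(observed - projected, 2))
```

The projected value of about 12.23 falls roughly two-thirds of a scale point below the observed 12.90, the difference that the text reports as not significant.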

A similar projection cannot be 
made for the fascists since Eysenck’s 
lone seven were middle-class and 
Coulter’s were working-class. What 
is of pertinence is the projection of 
scores of working-class members of 
the major parties. Using the same 
procedure as with the communists 
we find that the extension of Ey- 
senck’s R-scale means (8, Table 
XXIII, p. 138) yields projections as 
follows for Melvin’s scale: conserva- 
tives, 3.2; liberals, 4.2; and socialists, 
7.3. If we weight these groups by 
their representation in the British 
working-class population (as given 
in 7, p. 57) we find a projected mean 
of 5.8 on the R scale. Yet Coulter’s
“neutral” sample made an R-scale score of
10.8 (interpolated from 8, Fig. 26,
p. 141).

Why a group of soldiers in the 
British Army should be so strikingly 
more radical than would be expected 
on the basis of Eysenck’s own findings 
is extremely puzzling. It certainly 
does not argue for the generality of 
any conclusions based upon a com- 
parison of scores made by other 
groups with their own. The lack of 
internal consistency found in the 
analysis of data reported by Eysenck 
clearly indicates flaws in method- 
ology. Some of these we can pinpoint 
with reasonable accuracy; others are 
not so easily traceable since the essential
data are not given.


ARE COMMUNISTS AND FASCISTS
SIMILAR IN BEING “AUTHORITARIAN”?

Among the scales given to Coulter’s samples was the California F scale
(the particular form used is not specified). In discussing this
instrument Eysenck says, “It was entitled the F scale because Adorno
et al. considered it to be a measure of Fascist potential. This
interpretation, however, as we shall very soon see, is in part at least
erroneous as we have found Communists to make almost as high scores on
this scale as Fascists, and consequently we shall in this book refer to
the F-scale rather as the authoritarianism scale” (8, p. 149).

A few pages later he reports that the F-scale scores were: “neutral”
group, 75; communists, 94; and fascists, 159 (8, pp. 152-153). The
obvious fact here is that the communists do not “... make almost as
high scores on this scale as Fascists,” since the difference between
communist and fascist scores is an extremely large 65 points whereas
the communists differ from the “neutral” group by only 19 points. Once
again, Eysenck arbitrarily lumps communists and fascists together in an
attempt to indicate their similarity.

There are some singularly curious things about the F-scale scores which
Eysenck does not dwell upon. The item means for the three samples are:
“neutral” group, 2.5; communists, 3.13; and fascists, 5.30 (our
calculations). The range of possible scores is from 1.0 to 7.0 with 4.0
representing the theoretical neutral point. The communists’ mean is
below this point, indicating a general tendency to reject the items.
The fascists have a high acceptance score which represents a striking
confirmation of the validity of the F scale as a measure of fascistic
attitudes and bluntly refutes Eysenck’s contention that the F scale
does not measure fascist potential.
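
The item means follow from the total scores by simple division. The sketch below assumes a 30-item form of the F scale, which is inferred here from the ratio of the totals to the item means and is not stated in the text; the “neutral” total is taken as 75, consistent with the 19-point gap from the communists. Names are illustrative.

```python
# Assumption: 30 items, inferred from total score / item mean (e.g. 159 / 5.30).
N_ITEMS = 30
NEUTRAL_POINT = 4.0  # theoretical midpoint of the 1.0 to 7.0 item range

# Total F-scale scores; the "neutral" total of 75 is inferred as 94 - 19.
TOTALS = {"neutral": 75, "communist": 94, "fascist": 159}

item_means = {group: total / N_ITEMS for group, total in TOTALS.items()}

for group, mean in item_means.items():
    side = "above" if mean > NEUTRAL_POINT else "below"
    print(f"{group}: item mean {mean:.2f} ({side} the neutral point)")

print(TOTALS["fascist"] - TOTALS["communist"], TOTALS["communist"] - TOTALS["neutral"])
```

Only the fascist sample falls above the neutral point; the communist and “neutral” samples both fall on the rejecting side, 65 and 84 total-score points below the fascists.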

The “neutral” group which is so fascinatingly aberrant again
demonstrates its uniqueness. The score of 2.50 is the second lowest
score obtained in roughly 50 samples with which the writer is familiar.
What makes this fact so interesting is that this was a working-class
sample and working-class samples tend to make higher scores on the F
scale than comparable middle-class samples (the correlation between
F-scale scores and education, usually part of class definition, has
been estimated as being between -.50 and -.60 for American samples)
(4, pp. 168-170). The only known group making a lower score than
Coulter’s neutral group consisted of 26 graduate students at the
University of California (with an average of 6 semesters of graduate
work) who refused to sign a special loyalty oath. This group made a
score of 1.88 (9, pp. 124-126). (A comparison group of signers with
similar education scored 2.73 which is higher than Coulter’s “neutral”
group.)

American college students usually score in the 3.0 to 4.0 range on the
F scale (1, Table 12 (VII), p. 266; 11, Table 9, p. 245). If American
findings are applicable to British populations we should expect a
representative sample of British working-class males to make even
higher scores. Such an expectation would seem to be in accordance with
Eysenck’s own data since an examination of the proportionate acceptance
of T-scale items by working-class samples as contrasted with
middle-class samples (see Table 1) indicates that among sample members
of all parties the working-class respondents accepted more of the items
falling into the tough-minded conservative quadrant. These have high
similarity to the sorts of items that enter into the F scale and the
correlated E and PEC scales.

It is of obvious importance to have comparative data on British
samples. In so far as the writer knows, there is no published material
of this sort. However, Rokeach has recently been in Great Britain and
administered the F scale as well as other measures. The item means on
two samples of university students were 3.26 (N = 80) and 3.57
(N = 137). These findings suggest no marked differences between
American and British university students on the F scale. A group of 60
workers at Vauxhall Motors made an item mean on the F scale of 4.74.⁹
This higher score by working-class men as contrasted with a college
sample is in accordance with American findings and does not allay
suspicion that there was something extremely unusual about Coulter’s
“neutral” working-class sample.


Eysenck reports that Coulter’s sample of communists scored higher on
the F scale than the politically “neutral” group. Available evidence
indicates that communists tend to score lower than members of other
political parties on the F scale. It can be argued that Eysenck’s own
material supports the latter point of view. An examination of Table 1
indicates that communists were the least accepting of all samples of
members of various parties when it came to the items in the
conservative tough-minded quadrant which, as has been noted, are
similar to F-scale items in meaning. It has also been argued that the
selection of items in the T scale was apparently somewhat capricious.
If we examine the items in this quadrant which were not included in the
T scale (membership being determined from R and T factor saturations as
given in 8, Table XX, p. 129) we find that a highly similar pattern of
acceptance occurs. The mean percentage acceptance of the five T-scale
items in this quadrant by working-class communists is 18, while of the
seven non-T-scale items (17, 22, 26, 27, 30, 31, and 33) it is 20. The
corresponding figures for the most similar group, the socialists, are
44 and 55 per cent acceptance for items included and not included in
the T scale (computed from 6, Table III, p. 203).

9 Rokeach, M. Personal communication, 1955. The full implications of
Rokeach’s findings will be developed in his forthcoming monograph on
political and religious dogmatism.

Direct comparisons with American 
samples are not available. Although 
a sprinkling of communists was in- 
cluded in the samples described in 
The Authoritarian Personality, their 
F-scale scores were not reported. 
However, their scores on the E scale 
were given as well as the correlation 
between the E and F scales in the 
samples in which these communist 
subjects were included. Upon the
basis of these data it is clear that these
communist subjects scored extremely 
low on the F scale (see 4, pp. 130- 
133, for a fuller discussion as well as 
the congruence of such a conclusion 
with earlier work with communist 
responses on Stagner’s measure of 
fascism). 

We are therefore in complete dis- 
agreement with Eysenck’s conclu- 
sions that the F scale: (a) measures 
“authoritarianism” instead of po- 
tential fascism, or (b) that commu- 
nists make higher F-scale scores than 
samples of members of less extreme 
political groups. The only support 
for his position comes from the in- 
credibly low F-scale score purport- 
edly made by Coulter's “neutral” 
group. By using this score as a base 
and ignoring the implications of his
own data and the research of others
he arrives at a conclusion which we 
believe to be untenable. 


FURTHER CONSIDERATIONS 


It is clear that Eysenck’s communist
samples are neither “tough-minded”
nor “authoritarian” when
the data produced as evidence by 
Eysenck are carefully examined. Our 
analysis clearly indicates that com- 
munists respond to T-scale items 
simply in terms of the radical-con- 
servative loading and not in tough- 
or tender-minded fashion. This is a 
graphic illustration of the danger in- 
herent in assuming, as Eysenck ap- 
parently did, that a scale which pre- 
sumably measures one thing in a 
“normal” population (in the statisti- 
cal sense) measures the same thing in 
a radically different population. 

The point may be clarified by considering a specific example. Item 10,
“It is wrong that men should be permitted greater sexual freedom than
women by society,” may be interpreted in alternate ways. It may be that
the “wrong” is based upon the premise that neither men nor women should
be allowed sexual freedom; violation of this standard is therefore
“wrong.” This is apparently the interpretation of the item made by
Eysenck’s basic middle-class sample since the factor analysis of their
responses placed the item on the “tender-minded” side, which is
characterized by acceptance of religious and ethical items. An
alternative interpretation is also possible. The item might be accepted
by those who believe that both men and women should be allowed sexual
freedom and it is “wrong” to restrict the sexual freedom of women. This,
it is suggested, might lie behind the fact that the communists were
most accepting of this “tender-minded” item (see 12, Table I). This
possible explanation is supported by the fact that the communists
sampled were much more approving of companionate marriage (Item 29)
than any other group.

Rokeach and Hanley’s argument that Ferguson’s “religionism” and
“humanitarianism” factors account for Eysenck’s data better than the R
and T factors is convincing. This can be easily demonstrated for
T-scale items by an examination of Fig. 1. However, it is also
difficult not to appreciate the clear-cut radical-conservative axis
that appears in Eysenck’s data and to agree with Eysenck that there are
semantic advantages in using R and T when dealing with political
parties. It is contended that what weakens Eysenck’s position is the
fact that he has no items which are relatively pure measures of T. It
is further argued that this is a direct consequence of his original
procedure.

Eysenck originally collected, “From a total of some 500 items, all
those ... which had been shown to be of importance or relevance in any
previous research. When pruned of duplications, it was found that the
items did not suffice to make up the minimum number considered
requisite, and others were added by random selection until 40 items
altogether had been chosen” (8, pp. 121-122). It is extremely difficult
to believe that the 40 items used exhaust the range of possibly
relevant or important social attitudes (see 8, Table XVIII,
pp. 122-124).

It is therefore pertinent to question the consequences of Eysenck’s
original item selection procedures. If, instead of taking items which
had been of relevance in previous research, he had analyzed the
definition of tough-mindedness and then selected, invented, or modified
items which appeared relevant, and then factor analyzed responses to
them and other items, he might well have isolated a much purer
dimension of “tough-tender-mindedness.” Such a comment implies that
there are such items or they might be found. Machiavelli makes many
statements that are tough-minded, to say the least, but are not
concerned with sex, religion, nor punitive reflections upon man (as are
Eysenck’s tough-minded items). Whether a tough-mindedness scale could
be constructed whose items are relatively independent of
radicalism-conservatism or not, is an empirical question.

Although this is as yet an unresolved problem it has a great deal to do
with what is a key hypothesis in Eysenck’s theorizing: “... there is in
truth only one ideological factor present in the attitude field, namely
that of Radicalism-Conservatism. The T-factor itself does not
constitute an alternative ideological system but is rather the
projection on to the social attitude field of a set of personality
variables” (8, p. 170). It is suggested that Eysenck was forced into
the above position as a consequence of his original selection of items
which did not cover aspects of tough- and tender-mindedness which were
relatively independent of radical-conservatism. If such items exist,
then he might have found two ideological factors, the one
radical-conservative and the other a means-ends dimension. Does one
take an amoral attitude in implementing political ideology (be it
radical or conservative) or is there a concern with ethics and
principles?

It is of interest to note what personality variables Eysenck believed
were relevant to political attitudes. He suggests, “...
‘tough-mindedness’ is a projection on to the field of social attitudes
of the extraverted personality type, while ‘tender-mindedness’ is a
projection of the introverted personality type” (8, p. 174). Thus
Eysenck believes that communists and fascists are extraverted whereas
conservatives and socialists are introverted. Evidence comes from
Coulter’s study in which TAT ratings on extraversion gave the communist
and fascist samples a higher score than members of the “neutral” group
(8, p. 180). The dangers of comparisons utilizing scores made by the
latter group have already been indicated.

Eysenck also cites an unpublished study by George in which correlations
between introversion-extraversion and T were found. There was no marked
relationship with R. Neither R nor T was related to Eysenck’s other
personality factor of neuroticism (8, pp. 177-179).

It is impossible to confirm or deny Eysenck’s hypotheses in any
conclusive fashion upon the basis of available data. As an alternative
it is suggested that both radicalism and a true tough-minded amoral
syndrome might well be related to personality factors but that the
relationships depend upon the social setting. Any attempt to relate
personality variables to political ideology without taking the social
context into account is apt to be highly misleading as well as an
oversimplification of some highly complex interrelationships. Thus in a
study of communist defectors, Almond (2) reports marked differences
between middle-class and working-class members in the patterns of
motivation leading to their entry into the party. The former, on the
basis of analyses of interviews, were characterized by a high incidence
of neuroticism; the latter were not. These personality differences were
also related to the type of role played in the party since the
screening and training of party members led to quite marked role
differentiation. Those who became communist elites were quite different
personality-wise from those who did not. Almond presents a convincing
argument indicating that different sorts of individuals are attracted
to the Communist Party in different countries, at different historical
periods (before and during the Popular Front period), that in some
countries minority members are attracted and in others they are not,
and a host of relevant social and historical factors are operative in
causing people to join the Communist Party.

Any simple statements about the “communist personality” can fairly be
said to reflect a lack of appreciation for the complex social processes
involved in ideological deviance. This is not to say that members of
the Communist Party are not unique along certain personality
dimensions. This is not to say that communists and fascists have no
personality characteristics in common which differentiate them from the
“politically normal” population. The point being emphasized is that
there is a wide range of diversity among members of communist and
fascist parties and any broad generalizations about the characteristics
of communists and fascists which are based upon limited samples are
highly suspect.

Despite profound disagreement with
Eysenck’s methodological capriciousness
and his restricted theoretical
position, there are some valuable in- 
sights which can be derived from a 
critical analysis of his data. It is ap- 
parent that communists differ from 
others in the importance of the radi- 
cal-conservatism dimension in re- 
sponding to items. It is also clear 
that he has provided, albeit unwit- 
tingly, compelling evidence that the 
F scale actually measures fascistic 
ideology. 


What is especially interesting about
Eysenck’s data is the fact that it 
clearly refutes any notion that com- 
munists are mirror images of fascists. 
The communists sampled are mark- 
edly different not only from adher- 
ents of the major political parties 
but from fascists as well. The gen- 
eralizability of what has been in- 
formally called the ‘‘Budenz-Bentley 
syndrome” (authoritarians of the 
right and left are similar so it is easy 
to switch from one extreme to the 
other) is not supported. It should be 
noted that Almond’s data also refute 
this hypothesis. He found that only 
10 per cent of his sample of communist
defectors became religious
converts or returnees, members of
the extreme right, or conservatives.
The majority (53 per cent) became
moderate socialists or trade unionists,
6 per cent remained on the extreme 
left, 18 per cent were politically indif- 
ferent, and the remaining 13 per cent 
were classified “other” or “unknown”
(2, Table 15, p. 357, and 2,
p. 357).


The present critique has focused
upon Eysenck's treatment of communists
and fascists along the dimensions
of tough-mindedness and authoritarianism.
It would be grossly
unfair as well as misleading to imply
that Eysenck considers these samples
as similar on all other dimensions. He
cites Coulter's analysis of TAT
protocols in which it was found that
the correlation between ratings of
direct vs. indirect aggression was
−.94 among the communists sampled
and +.61 among the fascists
sampled (8, p. 205).

Since Coulter's thesis has not been 
published, a more detailed methodo- 
logical analysis is inappropriate at 
the present time. Her research is of 
interest, however, not only because 
of the many striking relationships 
found (as the magnitude of the correlations
cited by Eysenck indicates)
but also because she utilized a bat- 
tery of diversified instruments. Criti- 
cism of the use of a ‘‘neutral’’ group 
of a highly atypical nature as a basis 
for comparisons does not necessarily 
imply that Coulter's actual findings 
are not valuable. 


SUMMARY 


Eysenck’s treatment of the per- 
sonality of communists has been sub- 
jected to detailed analysis in the pre- 
ceding pages. It is concluded that: 

1. The samples studied are not 
representative of the parent popula- 
tions, that there is differential bias 
in the sampling of various groups, and 
that generalizations drawn from these 
samples are therefore unwarranted. 

2. The “tough-mindedness” scale
leads to misleading comparisons
among members of various political
parties because of biases built into
the scoring system. Further, the T 
scale clearly does not measure tough- 
mindedness among the communists 
sampled since they responded to in- 
dividual items in terms of their radi- 
cal-conservative loading. 

3. The contention that communists
are “authoritarian” as measured
by the F scale is unjustified since it
is based on the comparison of a communist
sample with a highly aberrant
“neutral” group.

4. Procedures which are utilized 
to differentiate communists and fas- 
cists from other samples are highly 
irregular and violate the data. 

5. A re-examination of the data 
indicates that the communists and 
fascists sampled differed from one 
another in crucial aspects as well as 
being different from the various comparison
groups sampled.

To have one’s writings submitted
to a very detailed and exhausting 
critique in the pages of the Psycho- 
logical Bulletin is a great honor; 
to have this happen twice is some- 
what overwhelming. Before, there- 
fore, replying to Christie’s comments 
(1) I would like to take this oppor- 
tunity of thanking both him and my 
earlier reviewers (6) for drawing at- 
tention to several minor misprints 
in The Psychology of Politics (4). 
While, as will be seen, I cannot agree 
with any of the major criticisms put 
forward, I shall always be indebted
to them for their painstaking examination
of the details of my book.¹

It is curious how much alike
Christie (1) and Hanley and Rokeach
(6) are in their failure to deal with
the logical development of the
theories and experiments outlined in 
this book (4). Psychological theory 
and factorial studies agreed in show- 
ing that the interrelations of social 
attitudes in Great Britain required at 
least two orthogonal factors or dimensions
for their description; these
factors were labeled R (for radical- 
ism-conservatism) and T (for tough- 
mindedness vs. tender-mindedness). 
Many theoretical and practical rea- 
sons are given why, descriptively, 
these two factors are superior to any 
of the innumerable alternative rota- 
tions which could be made, and 
Christie appears to agree with this 
when he says that it is “difficult not
to appreciate the clear-cut radical-conservative
axis that appears in
Eysenck’s data, and to agree with
Eysenck that there are semantic advantages
in using R and T when dealing
with political parties.”

1 Some of the points Christie makes have
already been answered in my earlier reply to
Rokeach and Hanley (5). The reader may
like to consult this earlier paper in conjunction
with the present one.

Our theoretical position leads us 
to believe that the T factor is the 
projection onto the attitude field of 
the personality dimension of extra- 
version-introversion, in the sense 
that extraverts will have tough- 
minded attitudes, introverts tender- 
minded attitudes. The content of the 
attitudes of extraverts and introverts 
respectively will be determined by 
their position on the radicalism-con- 
servatism axis. It would follow from 
this hypothesis that there should be 
very few, if any, pure T items; tender- 
mindedness and tough-mindedness 
should always appear in conjunction 
with either right-wing or left-wing 
tendencies. This is what we have 
found in actual fact after an examina- 
tion of many hundreds of different 
items. It is very satisfying to find 
hypotheses supported in this way, 
yet oddly enough Christie appears to 
hold the opposite view. He writes 
“It is contended that what weakens 
Eysenck's position is the fact that 
he has no items which are relatively 
pure measures of T.” The fact that,
if many such items could be found,
the theory which has been elaborated 
in The Psychology of Politics would 
be, not just weakened, but completely 
disproved, does not seem to occur to 
Christie. He blames our procedure 
of item selection for our failure to 
find pure T items, and says that if 
the writer “had analyzed the definition
of tough-mindedness and then
selected, invented, or modified items
which appeared relevant and then
factor analyzed responses to them
and other items he might well have
isolated a much purer dimension of
‘tough–tender-mindedness.’ Such
a comment implies that there were
such items or they might be found.
Whether a tough-mindedness
scale could be constructed whose
items are relatively independent of
radicalism-conservatism or not, is an
empirical question.”

Having attempted for many years 
to do what Christie advocates, and 
having had several students make 
similar attempts, all without success, 
the writer believes that Christie is 
somewhat optimistic. Perhaps if he 
had himself some practical experience 
in carrying out work of this kind he 
might be less inclined to dismiss the 
concentrated efforts of several people 
over many years in this superficial 
fashion. It is impossible for the 
writer to prove a negative, i.e., to 
prove that such pure T items do not, 
in fact, exist; all that can be done is 
to carry out the search over a long 
enough period and wide enough field 
to make one’s failure to find such 
items convincing evidence to the un- 
prejudiced judge. Christie's critique 
would have gained considerably if he 
had shown some appreciation of the 
methodological position, and even 
more if he had actually succeeded in 
unearthing such items. 

Granted that hitherto no pure T 
items have emerged, and granted also 
that the dimensional analysis of the 
attitude field requires two dimen- 
sions, it is clearly essential for the 
construction of a T scale to use items 
having reasonably high correlations 
with T, and which are selected in 
such a way that their correlations 
with the R scale balance out. Christie’s
comment on this is that “The crucial
point in an interpretation of Eysenck’s
results is that the T scale is a somewhat
better measure of R than T.
The mean loading of T scale items on
T is .38, on R .48.” The confusion
evident in this quotation appears to 
invalidate most of Christie’s argu- 
ment as far as it relates to the con- 
struction of the T scale. The crucial 
point is that items are selected in 
such a way that if we have two tough- 
minded items one would be a radical, 
the other one a conservative item. By 
adding the two we add the T vari- 
ances and cancel out the R variances. 
As an example of this, let us consider 
an imaginary miniature scale con- 
sisting of two items. The first item 
relating, say, to trial marriages has a 
loading of +.6 on R and +.5 on T; 
the other item relating, say, to the 
death penalty has a loading of −.6
on R and +.5 on T. For the purpose
of the T scale a “Yes” answer would
in each case be counted one point. A
person saying “Yes” to both questions
would therefore get a score on
the T scale of 2, a person answering 
“No” to both questions would get a 
score of zero. The fact that both 
items have higher correlations with 
R than with T does not mean that 
the sum of the answers is a good 
measure of R. A person high on R 
would say “Yes” to the first and
“No” to the second item; a person 
low on R would reverse this. This 
point would seem too elementary to 
discuss in such detail, but as much 
of Christie’s critique is based on it, 
it seemed desirable to clear it up. 
Rokeach and Hanley appear to be 
subject to a similar error of interpre- 
tation. If Christie were, in fact, cor- 
rect in his contention that the T 
scale is a good measure of R, then it 
should correlate with the R scale. As 
the studies reported in The Psychology
of Politics (4) show, no such correlations
have in fact been observed.
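
The arithmetic of the imaginary two-item scale above can be sketched in a few lines. This is only an illustration under stated assumptions: R and T are treated as uncorrelated, unit-variance factors, so that the covariance of a unit-weighted sum score with a factor is the sum of the item loadings on that factor. The function name `sum_score_loadings` is hypothetical, and the loadings are those of the imaginary items, not of any item in the actual T scale.

```python
# Loadings of the two imaginary items on the two factors,
# given as (loading on R, loading on T), as in the example in the text.
items = {
    "trial marriages": (0.6, 0.5),
    "death penalty": (-0.6, 0.5),
}


def sum_score_loadings(item_loadings):
    """Return the (R, T) covariances of the unit-weighted sum score.

    Assumption: R and T are uncorrelated factors with unit variance,
    so the covariance of a sum of items with a factor equals the
    sum of the items' loadings on that factor.
    """
    r_total = sum(r for r, _ in item_loadings.values())
    t_total = sum(t for _, t in item_loadings.values())
    return r_total, t_total


r_cov, t_cov = sum_score_loadings(items)
print(f"covariance of sum score with R: {r_cov:+.2f}")
print(f"covariance of sum score with T: {t_cov:+.2f}")
```

The cancellation on R is exact here only because the two R loadings are equal and opposite; with unbalanced loadings some R variance would remain in the sum score.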

The writer would readily admit 
that our first version of the T scale 
fell short of perfection in several 
respects; this was one reason why 
an improved version was construct- 
ed by Melvin (9). However, Christie 
is in the unfortunate position that
if we completely accepted his criticism
of the scoring system adopted,
then our results would support
even more strongly our own hypothesis,
and go counter to his. He
maintains that “by virtue of an
asymmetric distribution of items
combined with Eysenck’s singular
scoring system, a hypothetically consistent
fascist is automatically made
more ‘tough-minded’ by one point
than a hypothetically consistent communist.”
As we have throughout
found communists to be slightly less
tough-minded than fascists, Christie's
argument would suggest that, in fact,
we should increase the communists’
scores by one point, thus making
them even more like the fascists than
appears in our results. As Christie's
main argument appears to be that
communists are not tough-minded at
all, and are quite unlike fascists in
this respect, acceptance of his criticisms
of our scoring system would,
therefore, strengthen our position
and weaken his.

The same is true when we look at 
another comment. Christie main- 
tains that “the arbitrary system of 
scoring which treated zero responses
as ‘tough-minded’ thus introduced a 
bias of unknown extent in the direc- 
tion of making the members of the 
three major parties more ‘tough- 
minded,’ relatively speaking, than 
those of the two deviant parties.” 
Again, even if Christie's criticism 
were well taken, it would merely 
mean that we had loaded the dice
against our own hypothesis; making
the appropriate corrections would 
make our results support our theory 
even more strongly. 

Another criticism of the scoring 
system the writer does not under- 
stand at all. Christie maintains that 
“the T scale simply does not apply 
to communists (or at least to this 
sample). Comparisons of scores made 
by communists on a scale on which 
they do not respond along the con- 
tinuum measured with scores by 
other samples are meaningless.” Just
what is meant by saying that a certain
scale “simply does not apply” to
a certain group? One might imagine 
that it would have zero, or at least 
quite low reliability for that group; 
yet Coulter has shown that the relia- 
bility of the T scale is higher for the 
communists than for fascists, or our 
neutral group (2, p. 43). Does it, 
perhaps, mean that our measurement 
of T is only a watered-down and less 
reliable measure of R? The relia- 
bility of the T scale for communists 
is higher than that of the R scale, 
and the two scales do not correlate. 
Does it, perhaps, mean that T does 
not correlate with other variables in 
the case of communists, while it does 
so in the case of fascists and other 
groups? Again, Coulter (2) has 
shown that the opposite is true, if
anything. Is it that scores on the
scale do not behave in conformity
with firmly grounded theory? But
here again, as shown in The Psychol- 
ogy of Politics (4) and the more re- 
cently concluded study by Nignie- 
witzky (10), to be discussed below, it 
is found that communists behave pre- 
cisely in the predicted manner. 

Is it that the T scale is irrelevant to 
political party structure as compared 
with the R scale? Here, Nigniewitzky's
finding on a representative
sample of the French middle-class
population is relevant; he finds that
the T scale, while independent of the
R scale statistically, is actually superior
to the R scale in differentiating
between members of the different 
political parties (including the com- 
munists) (10). It is submitted, 
therefore, that Christie’s statement is 
strictly meaningless. If Christie had 
quoted the relevant statistical find- 
ings, this fact would have become 
apparent immediately. 

We must now turn to the problem 
of sampling. Christie spends a con- 
siderable amount of space in trying 
to show that our middle-class sample 
was “completely unrepresentative of
the British middle class.” As the
writer himself has stressed this point 
several times, Christie's work ap- 
pears to be a task of supererogation. 
As was pointed out in The Psychology 
of Politics (4, p. 127): “Our interest
lay not in obtaining a representative 
cross-section of the population but in 
comparing different political groups. 
This can best be done by having the 
groups of equal size, thus reducing 
sampling errors to a minimum. If 
mean values are wanted for the total 
population, then mean values for the 
selected groups can be multiplied by 
the proportions these groups form of 
the total population, thus giving an 
adequate indication of population 
values.” Again, Christie appears to
doubt this statement: “In view of
the fact that Eysenck’s basic middle- 
class sample is markedly unrepre- 
sentative of the British middle- 
classes, it would be highly dangerous 
to project their attitudes to obtain 
an estimate of the parent popula- 
tions.” 

It may be tedious to the reader to 
spell out this point in detail because 
of its quite elementary nature, but 
as Christie has devoted so much 
space to it, his misinterpretation
requires correction. If we are interested
in the variance contributed
to a given score by a number of fac- 
tors, such as political party, sex, age, 
and education, then the most efficient 
design for giving us such informa- 
tion is obviously one in which all the 
possible groups into which these four 
methods of classification divide the 
population are represented in equal 
number. A representative sample of 
the population would be relatively 
inefficient, particularly when some of 
the groups (liberals, university-edu- 
cated) comprise only a very small 
portion of the population. Mean 
values from such an analytic sample 
cannot, of course, be taken as repre- 
sentative of the whole population; 
we would require to correct the fig- 
ures obtained for each subgroup by 
taking into account the proportion 
of people in that special group in the 
total population. When this is done 
we obtain an estimate of population 
parameters which is only a little in- 
ferior to one obtained from a random 
sample. Thus, an analytic sample is 
vastly superior to a random sample 
with respect to the analysis of the 
influence of different factors, and is 
very little, if at all, inferior to it with 
respect to obtaining estimates of 
population parameters. As our pur- 
pose was not that of obtaining popu- 
lation parameters but of determining 
the relative influence of the factors 
indicated, Christie’s argument ap- 
pears to be quite irrelevant to the 
facts of the situation. 
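
The correction described above, in which each subgroup mean is weighted by that subgroup's proportion of the total population, can be sketched as follows. The group names, means, and population shares are hypothetical numbers chosen for illustration and are not taken from any study; `weighted_population_mean` is likewise a hypothetical helper.

```python
def weighted_population_mean(group_means, population_shares):
    """Combine subgroup means using each subgroup's population share.

    Assumption: every subgroup mean is a fair estimate for its own
    subgroup, and the shares cover the whole population (sum to 1).
    """
    total_share = sum(population_shares.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(group_means[g] * population_shares[g] for g in group_means)


# Hypothetical illustrative values: equal-sized analytic groups whose
# shares of the population differ widely.
group_means = {"conservative": 12.0, "liberal": 16.0, "socialist": 14.0}
population_shares = {"conservative": 0.5, "liberal": 0.1, "socialist": 0.4}

# Plain average over the equal-sized groups (not a population estimate).
unweighted = sum(group_means.values()) / len(group_means)
# Estimate after correcting for each group's share of the population.
weighted = weighted_population_mean(group_means, population_shares)

print(f"unweighted mean of group means: {unweighted:.2f}")
print(f"population-weighted estimate:   {weighted:.2f}")
```

The weighted figure is only as good as the subgroup means that enter it: the sketch assumes each subgroup sample is representative of its own subgroup in the population.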

It should not be assumed from this, 
however, that our sampling proce- 
dures are not subject to criticisms 
on any point. We know of no com- 
plex study in social psychology which 
has handled this problem with com- 
plete adequacy, and we have through- 
out been aware of certain weaknesses 
in our sampling procedures. The details
have always been given in sufficient
detail to enable the reader to
form his own views as to the degree 
to which our conclusions should be 
modified because of these imperfec- 
tions. In this our writings are in de- 
cided contrast to Christie’s own cri- 
tique. He seems to be quite happy to 
establish a point by referring to work 
carried out by Rokeach in which 
scores are given for groups of stu- 
dents and Vauxhall Motors workers 
without any mention at all of sex 
composition, method of sampling 
used, and so forth. Critics who cavil 
at the relatively full data presented 
in respect to the sampling pro- 
cedures used by the writer might be 
expected to heed their own advice. 
In view of Christie's failure to give 
any details at all, the writer cannot 
take seriously the means presented, 
or the criticisms based on them. 

It is fortunate that quite recently
it has become possible to carry out a
large-scale study in France, making
use of a properly selected representative
sample of the French middle-class
population. This study was
carried out by R. Nigniewitzky (10) 
and gave results which are of consid- 
erable relevance to Christie's re- 
marks. Communists on the new and 
improved form of the T scale were 
found to have a mean score of 10.3; 
fascists to have a mean score of 10.2; 
communist fellow-travelers had a 
mean score of 10.2. The mean score 
of the supporters of all the other main 
French parties was 17.6. Commu- 
nists and fascists again appear as very 
much more tough-minded than the 
democratic parties. 

These results are important for 
several reasons. Christie takes us 
to task for selecting communists and 
fascists who were actively engaged 
in political work, and comparing 
them with people who voted for the
main three parties, but were not
specially active in the political world. 
This, he maintains, introduced a 
sampling bias because differences 
may be due to the factor of being 
politically active rather than to being 
procommunist or profascist. This 
argument is almost impossible to 
disprove because in England mem- 
bers of the communist party and 
communist adherents generally
are all characterized by this strong 
degree of political activation; it 
would be practically impossible to 
find communists and fascists not 
active in this way, and if any 
could be found they would be ex- 
tremely atypical. Conversely, the 
typical conservative, liberal, or so- 
cialist voter or party member, how- 
ever strong his convictions, does not 
indulge in the same kinds of activities 
as does the communist or fascist. It 
would, therefore, be not just difficult 
but impossible to find conservatives, 
liberals, or socialists carrying out, 
with equal intensity, the kinds of 
things done by communists and fas- 
cists, and again, if such people could 
be found they would be extremely 
atypical. Christie argues, “In short,
to what extent are differences in at- 
titudes between communists and 
major party members traceable to 
ideology per se and to what extent 
to other factors relating to political 
activity?” It would, indeed, be interesting
to know the answer to this
question, but only someone excep- 
tionally ignorant of conditions in 
Britain at the moment would expect 
it to be possible to find the answer 
in this country. 

There are other difficulties which 
make any ordinary kind of sampling 
procedure inapplicable in England. 
The number of fascists and com- 
munist party members in the whole 
country is usually considered to be
less than 100,000; thus, it would take
a sample of 300-400 people to find a 
single communist or fascist. To get 
even the relatively small number of 
86 communists and fascists which 
formed our sample, it would require 
a random sample of some 25,000 peo- 
ple. When to this is added the secre- 
tiveness of fascist party members, 
who usually refuse to answer ques- 
tions, and the contempt of commu- 
nists for this type of work, and their 
consequent aversion to taking part in 
it, the impossibility of using orthodox 
methods should be even more obvi- 
ous. As if all this were not enough, 
there is in addition the difficulty that 
if one were not to make party mem- 
bership the criterion for acceptance 
of a person as being a communist or 
fascist, one would be left with no cri- 
terion at all. In the case of the major 
parties, identification was based on 
voting behavior. This is not applica- 
ble to the fascists as there were no 
fascist candidates during the elec- 
tion, and it is hardly applicable to 
the communists because communist 
candidates were standing only in a 
very small number of highly atypi- 
cal constituencies. Christie condemns 
our method of sampling; he does not 
indicate how it could have been im- 
proved—even without taking into 
account the limitations imposed by a 
budget which never rose above, and 
frequently fell short of, the sum of 
100 dollars per annum. 
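
The order of magnitude quoted above for the required random sample follows from simple multiplication. A minimal sketch, assuming the text's figure of one communist or fascist per 300-400 people and treating the result as an expected yield only (the function name is hypothetical):

```python
def required_random_sample(target_cases, people_per_case):
    """Expected random-sample size needed to yield `target_cases` members
    of a group that makes up one person in every `people_per_case`."""
    return target_cases * people_per_case


# 86 communists and fascists, at one such person per 300 to 400 people.
low = required_random_sample(86, 300)
high = required_random_sample(86, 400)
print(f"expected sample needed: between {low:,} and {high:,} people")
```

The figure of some 25,000 given in the text corresponds to the lower end of this range.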

It is here that our French study is 
so important. In France, the com- 
munist party is a mass party, with 
sufficient members of a nonactive 
character to make it comparable to 
other parties, and to make possible 
orthodox methods of sampling. When 
this is done, as has been pointed out 
above, the result shows even more 
striking differences in the predicted 
direction than were found in this
country. Thus, an improvement in
sampling procedures, as demanded by 
Christie, and an improvement in the 
scale used do not result, as would be 
predicted from his criticisms, in a 
lessening of the observed differences 
between communists and the ortho- 
dox political parties; quite on the 
contrary, the differences become much 
wider and much more significant.
Christie might well reply that his 
criticisms were concerned with the 
studies reported in The Psychology of
Politics, and that this new study is 
irrelevant. This, however, is not so. 
In all experiments which involve 
sampling, the investigator has to 
make certain decisions as to which 
factors are, and which are not, likely 
to influence the results, and in need 
of experimental control. Similarly, 
the reader has to decide to what extent
he is willing to accept the investigator’s
judgment and to what extent
he is prepared to reject it. Even
the best stratified sampling pro- 
cedure involves a decision as to the 
relevant variables which are to be 
used for the stratification. There are 
grounds here for legitimate disagree- 
ments. No random sampling pro- 
cedure fails to encounter the problem 
of nonresponders; no method of han- 
dling this is beyond criticism. In 
studies like the ones reported in The 
Psychology of Politics, where random 
and stratified sampling could not be 
used in the orthodox manner, deci- 
sions have to be made by the investi- 
gator with which the reader may dis- 
agree legitimately. Only additional 
investigations can settle issues which 
otherwise must remain a matter of 
opinion. In the writer’s view, the 
sampling methods used in The Psy- 
chology of Politics, while far from per- 
fect, have adequately substantiated 
the hypothesis under investigation. 
According to Christie they have not.
The only way of deciding is not by
rather pointless argument, but by
further experiment.² It is the writer's
view that the Nigniewitzky (10) ex- 
periment has settled the issue as far 
as the sampling controversy is con- 
cerned. 

2 One of the criticisms made by Christie may
serve as an example of the kind of point on
which legitimate disagreements might arise.
The writer, having found that certain controls,
such as age, were uncorrelated with T in
his middle-class sample, did not consider it
necessary to impose these controls on his
working-class sample as this would have made
the investigation very much more expensive
and cumbersome. Christie argues that while
controls were irrelevant in the middle-class
sample, there is no proof that they were irrelevant
in the working-class sample, and that
consequently the controls should have been
retained. This is a possible point of view. It
certainly would be more satisfactory if all
possible sources of variation could be controlled
in experiments of this kind. As this is
impossible, judgments have to be made as to
the relative importance of different aspects of
the investigation. In the absence of any evidence
to the contrary, it seemed unlikely to
the writer that correlations between T on the
one hand and age, etc., on the other would be
so very dissimilar in a working-class group as
compared with a middle-class group. Christie
quotes some evidence to show that relationships
between attitudinal variables are different
in middle- and working-class samples, but
that, of course, is quite a different point; we
are here concerned with correlations between
factor scores and control variables. It may be
said, in parentheses, that in recent unpublished
work we have found relationships between
T and the various control variables to
be very much the same in working-class as in
middle-class samples. This does not, of course,
invalidate the principle of Christie's criticism;
it merely illustrates that a criticism may be
abstractly legitimate without being necessarily
damaging to the conclusion arrived at.

A good deal of Christie's argument
is concerned with findings from American
studies, which he believes contradict
our own findings. He appears
to believe that relationships between
social attitudes and personality factors
depend upon the social setting.
“Any attempt to relate personality
variables to political ideology without
taking the social context into account
is apt to be highly misleading
as well as an oversimplification of
some highly complex interrelationships.”
The reader might not guess
it from Christie's comments, but this 
is almost precisely what the writer 
himself has pointed out in his book.
This is what he has to say. After 
pointing out that most of the work 
contained in The Psychology of Poli- 
tics was carried out in England, he 
goes on to say that “results from 
Germany and Sweden, as well as 
from the U.S.A., make it seem likely 
that the main conclusions drawn here 
would apply equally well there; it 
would not be wise, however, to generalize
too far. ... This is particularly
important when considering the
personality structure of members of 
groups such as the fascist and com- 
munist parties. In our culture, these 
are minority groups; it is unlikely 
that conclusions based on members 
of such groups could be transferred 
without change to members of the 
Communist Party in the U.S.S.R., or 
to members of the former N.S.D.A.P. 
in Germany. When we talk about 
communists and fascists, therefore, it 
is about British communists and
fascists we are talking, not about their
foreign prototypes. At times the
reader will undoubtedly be tempted
to generalize beyond this restriction;
if he does, he does so at his own peril”
(italics not in original). Many of
Christie's arguments and criticisms 
are based on assumed similarities be- 
tween English and American condi- 
tions. He is free to indulge in these 
speculative exercises, but the writer 
should make it clear that they have 
little relevance to his own writings or 
views. Attempts have been made to 
extend our work to other countries 
like Spain (11), France (10), Sweden
(7), Germany (3), the Near East (8),
and so forth; the accumulation of
facts would appear more important
than the armchair theorizing in which
Christie delights. The reader of these
detailed reports may form his own
views regarding the degree of cultural
dependence of R and T.

Eysenck’s reply (11) to a methodological
critique (4) of his writings on
personality and politics appears, at 
best, to be lacking in candidness. A 
number of specific criticisms were 
made of his work. He does not refer 
to many of these. Others he at- 
tempts to evade by distorting the 
original criticism and giving irrelevant
answers. This is a serious accusation.
It may be best evaluated
by summarizing the original criti- 
cisms and then considering his re- 
sponses, if any, to them. 

One initial comment should be
made. Eysenck says that the criticism
failed to “...deal with the
logical development of the theories
and experiments outlined [in The
Psychology of Politics (9)]...” (11,
p. 431). The reason for not taking
Eysenck’s theories seriously is simple.
Their basis is essentially an inductive
one.² They primarily rest
upon data collected by Eysenck and 
his students. Other material which 
lends support is cited; that which is 
contradictory is slighted or ignored 
(3). Errors in the collection, process- 
ing, or analysis of data on the part of 
Eysenck and his collaborators are 
therefore extremely relevant for the 
validity of his theories. 

Although the temptation to rise to 
some of Eysenck’s more irrelevant 
remarks is tantalizing, scientific criti- 
cism is best served by returning the 
argument to the level of fact. The
procedure followed in the critique
of systematically evaluating methodological
flaws in Eysenck’s work will
be followed for the sake of simplicity
and comprehensiveness.

1 The title of this paper is based, appropriately
enough, upon that of a book by H. J.
Eysenck (8).

2 This is not intended as a critical remark.
The writer is favorably disposed toward a
truly inductive approach at the present state
of the development of social psychology.


THE T SCALE AND “TOUGH-MINDEDNESS”


Sampling. An analysis of Eysenck’s
samples of middle-class supporters of 
various political parties was made. 
It was concluded that they were non- 
representative, as evidenced by gross 
discrepancies in age and education 
between them and estimates of the 
British middle class based upon Brit- 
ish census data. Eysenck agrees with 
this conclusion but regards a syste- 
matic attempt at indicating the ex- 
tent of the bias as a ‘‘task of super- 
erogation” (11, p. 434). 

Eysenck was not criticized for us- 
ing available samples or for advocat- 
ing properly conducted analytic sam- 
pling. He was criticized for main- 
taining that scores of, e.g., univer- 
sity-educated, older, middle-class, 
male Liberals, as sampled by him, 
could be projected to the parent pop- 
ulation of individuals meeting these 
criteria in Great Britain. 

It was pointed out that Eysenck’s 
students tended to collect question- 
naires from individuals who were pre- 
sumably most like them, i.e., young 
and highly educated. At best, one in
twenty of the British middle-class
population has had a smattering of
university education, whereas well over
half of Eysenck’s sample have so
benefitted (4, p. 415). Half of his sample
were under 30 years of age as con- 
trasted with but a fifth of the British 
adult population (4, p. 414). 
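
The size of these discrepancies can be expressed as a simple ratio of sample share to population share. The following is a minimal sketch, not part of the original analysis: the function name is hypothetical, and the shares are the rough bounds quoted above ("one in twenty", "well over half", "a fifth"), not exact census figures.

```python
def overrepresentation(sample_share: float, population_share: float) -> float:
    """Return how many times more common a trait is in the sample than in the population."""
    if not 0 < population_share <= 1:
        raise ValueError("population_share must lie in (0, 1]")
    return sample_share / population_share

# Shares taken from the text. "At best one in twenty" and "well over half"
# are bounds, so the first ratio understates the bias.
university_ratio = overrepresentation(0.50, 0.05)  # some university education
under_30_ratio = overrepresentation(0.50, 0.20)    # respondents under 30

print(f"university education: at least {university_ratio:.1f} times the population share")
print(f"under 30 years of age: about {under_30_ratio:.1f} times the population share")
```

On these bounds the university-educated are over-represented at least tenfold, which is the order of distortion at issue when sample means are projected to the parent population.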

Furthermore, Eysenck’s students
gave questionnaires to friends or 
acquaintances. At best, the parent 
population of his samples can be de- 
fined as consisting of only those indi- 
viduals who were known by students 
of Eysenck’s. Strictly speaking, sta- 
tistical generalizations cannot be 
made to even this highly restricted 
parent population since there is no 
evidence that there was random selec- 
tion of respondents within this pool 
of potential subjects. Lindquist (14, 
pp. 73-74) has a clear discussion of 
the pitfalls involved in generaliza- 
tions based upon nonrandomly drawn 
samples. 

There is therefore no justification 
for Eysenck’s suggestions (5, p. 57;
9, p. 127) that the test scores of his 
admittedly nonrepresentative sam- 
ples can be projected to obtain a 
meaningful estimate of scores of the 
British population. 

No evidence whatsoever was given 
as to how Eysenck selected working- 
class respondents of the major par- 
ties. The inference was drawn that 
these were also collected by students 
from among their acquaintances. 
This is not denied by Eysenck. 
Among the working-class respondents 
known to Eysenck’s students, ques- 
tionnaires were given to 27 Liberals 
(9, p. 137). These were selected 
solely upon the basis of being known 
by Eysenck’s students at London 
University. If anyone maintains that 
the scores of these 27 individuals can 
be meaningfully projected to all 
working-class members of the Liberal 
Party in Great Britain he may be 
even more “‘exceptionally ignorant of 
conditions in Britain”’ than this critic. 

It was also pointed out that Ey- 
senck’s communist sample was se- 
lected from an active political or- 
ganization. They, by definition, were 
active group members and in this
sense differed from the majority of
the population. The question was 
then asked as to “... whether they 
[Communists] are less different from 
those who are politically active in 
major parties than from those who 
merely list themselves in a particular 
way when asked to do so” (4, p. 
413). The question raised was: are 
the T-scale scores of communists 
(by definition, politically active) 
more similar to those of active mem- 
bers of major political parties than 
to those of inactive members of major
political parties? Eysenck does not 
address himself to this question. In- 
stead he repeats the point made in 
criticism that there are no inactive 
communists and then concludes that 
it follows that the comparison is
impossible!³

Finally, Eysenck’s description of 
Coulter’s sample of working-class 
males in the British Army as being a 
“random sample of the British
working-class” was questioned. The
absurdity of this statement was 
pointed out. The test scores made 
by these men were compared with 
Eysenck’s other sample of British 
working-class males and significant 
differences between the mean scores 
of the two groups on the R (radical- 
ism) scale were found. Neither sam- 
ple was representative. Differences 
in test scores indicated they were not 
drawn from the same parent popula- 


3 Melvin compared the ‘‘tender-minded- 
ness’’ scores of members of his sample who 
listed themselves as ‘‘active in politics’ (no 
definition of activity other than respondents’ 
self-classification) with those sample members 
who did not so list themselves (15, Fig. 22, 
following p. 329). No differences emerged. 
Such a finding is directly relevant to the origi- 
nal criticism. If Eysenck had cited it, the bur- 
den of rejoining (that such a criterion is too 
vague) would have fallen upon the critic. 
Instead of citing relevant evidence, however, 
Eysenck perverted the logic of the argument, 
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tion. Eysenck does not produce any 
new evidence to rebut the analysis; 
indeed, he does not mention this 
aspect of the criticism. 

Eysenck has not chosen to give a
rebuttal to any of the specific
criticisms made of either his
generalizations which were based upon
unrepresentative samples or of his
comparisons of samples which differed in
the way they were drawn from the
presumed parent population. He attempts
to evade the issue by implying that
criticism was directed solely toward his
sampling procedures rather than the
generalizations based upon them. He
says, “We know of no complex study in
social psychology which has handled this
problem [sampling] with complete
adequacy...” (11, p. 434). Although the
writer disagrees, the point is
irrelevant.⁴ What is relevant is the
degree of caution with which
generalizations are made from samples
which cannot be randomly drawn from the
parent population. Almond’s study of
communist defectors (2) is an example of
scientific restraint in such a situation.
Measurement. Attention was directed
toward the bias caused by Eysenck’s
treatment of the “no-answer” category.
It was pointed out that its always being
scored as “tough-minded” led to the
seemingly paradoxical situation where a
person who had no opinion on anything
emerged as the epitome of
“tough-mindedness” whereas someone who
disagreed with everything was a paragon
of “tender-mindedness.” Eysenck does not
choose to give any rationale for the
unique procedures which could lead to
such nonsensical results.


4 Among the studies cited in the references 
listed in the critique, that of Stouffer (18) 
handles the problem of sampling in exemplary 
fashion. 

It was also pointed out that the
specific comparisons between mem- 
bers of the three major political par- 
ties and the two deviant groups 
(communists and fascists) were af- 
fected by this scoring procedure. It 
was inferred from bits of data pre- 
sented in Eysenck’s writings that 
members of the major parties were 
characterized by a higher frequency 
of “‘no-answer’’ responses than mem- 
bers of the two deviant parties. As 
Eysenck correctly notes in his rebut- 
tal, this loaded the dice against his 
“hypothesis.” The amount of error
introduced by this peculiarity of the
scoring system is so much less than
that arising from some of the mistakes
in addition (16) and highly aberrant
methods of analysis (4) that it does
not materially strengthen Eysenck’s
position. This is the sole comment
Eysenck makes about his strange
treatment of the “no-answer” category,
and it does not clarify the basic
issues involved.

In examining the asymmetric
distribution of items in the four
quadrants of Eysenck’s two-factor space
it was pointed out that the
peculiarities of the scoring system
were such that a hypothetically
consistent communist would automatically
be less “tough-minded” by one point
than a hypothetically consistent
fascist. Eysenck’s reply is a model of
disingenuousness: “As we have throughout
found communists to be slightly less
tough-minded than fascists, Christie’s
argument would suggest that, in fact,
we should increase the communists’
scores by one point, thus making them
even more like the fascists than appears
in our results” (11, p. 433). How the
criticism could possibly suggest to
Eysenck that increasing communists’
scores by one point could equate for the
inadequacies of the scoring system is
most puzzling. The T scale is so scored
that the higher the score, the greater 
the purported ‘‘tender-mindedness.” 
Adding a point to the scores of com- 
munists would therefore have the 
opposite effect from that postulated 
by Eysenck: it would increase the
difference between communists’ and
fascists’ scores on the T scale! Aside
from this non sequitur, Eysenck’s 
finding that communists are more 
“tender-minded” (by from roughly 
one to two points depending on whose 
addition is used) is partially due to 
the biases resulting from his scoring 
system. Actually, since his com- 
munist sample did not respond to 
T-scale items as Eysenck hypothe- 
sized they should, the bias leads to 
an unknown error in the scoring sys- 
tem. 
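
The direction of the scale is the whole of this point, and it can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical and the two means are invented, chosen solely so that communists score slightly higher (more "tender-minded") than fascists, as Eysenck reports.

```python
def tender_gap(communist_mean: float, fascist_mean: float) -> float:
    """Communist minus fascist T score; positive means communists look more 'tender-minded'."""
    return communist_mean - fascist_mean

# Hypothetical means (assumption): higher T score = greater purported "tender-mindedness".
communist_mean, fascist_mean = 5.0, 4.0

gap_before = tender_gap(communist_mean, fascist_mean)
gap_after = tender_gap(communist_mean + 1, fascist_mean)  # Eysenck's proposed "correction"

print(f"gap before: {gap_before:.1f}, gap after adding one point: {gap_after:.1f}")
```

Whatever the actual means, adding a point to the group that already scores higher can only widen the difference between the two groups.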

The criticism relating to the asym- 
metric distribution of 14 items in 
four quadrants is not rebutted by 
Eysenck’s discussion of a symmetrical
two-item scale.⁵ This has nothing to


5 In addition to attempting to evade criti- 
cism by a superficial example, it is important 
to note that Eysenck uses “imaginary” values 
(11, p. 432) for the two items in his example. 
He says, ‘‘The first item relating, say, to trial 
marriage has a loading of +.6 on R and +.5 
on T; the other relating, say, to the death 
penalty has a loading of -.6 on R and +.5 on
T” (11, p. 432). The items referred to are 
Nos. 29 and 36 of his version of the T scale 
(9, Table XVIII, p. 123). The actual loadings 
of these items according to Eysenck’s
reported findings are -.53 and +.56, -.60 and
-.20 respectively (9, Table XX, p. 129). Both
items are loaded on the radical side of R; the 
first is “tough-minded"’ and the second 
“‘tender-minded.”’ Eysenck goes on to say, “A 
person saying ‘Yes’ to both questions would 
therefore get a score on the T scale of 2, a 
person answering ‘No’ to both questions 
would get a score of zero’’ (11, p. 432). Ac- 
cording to Eysenck’s scoring system, however, 
they would both get scores of one on the T scale 
but for different reasons. A person accepting 
both would be given a point for his affirmative 
response to Item No. 36; a person rejecting 
both would be given a point for not accepting 

do with the problem of asymmetry
which exists in both Eysenck’s and 
Melvin’s scales. It may be that this 
is one of the points that Eysenck 
claims to have answered in his reply 
to Rokeach and Hanley (11, Foot- 
note 1, p. 431). There Eysenck says, 
“The T score combines in equal pro- 
portions radical and conservative 
items and thus gets rid of the compli- 
cation introduced by the R_ fac- 
tor...’ (10, p. 180). 

This statement is irrelevant. An 
examination of Fig. 1 in the critique 
(4, p. 419) indicates that seven T-scale 
items are saturated with radicalism 
and seven are loaded on the conserva- 
tive end of the axis. This does not 
get rid of complications because of 
the asymmetry of the dispersal of 
the items in the quadrants which was 
the point made in criticism. 

Eysenck nowhere in his writings 
discusses the rationale for having an 
asymmetric distribution of the T- 
scale items (five, four, three, and 
two) in the four quadrants of his fac- 
tor space. Let us accept for a mo- 
ment his ground rules for the distribu- 
tion of items—namely that there 


Item No. 29. 

Further, “A person high on R would say 
‘Yes’ to the first and ‘No’ to the second item; 
a person low on R would reverse this" (11, p. 
432), indicates Eysenck’s confusion about 
items in his own scale and their scoring. A 
person high on R should accept both state- 
ments and one low on R should reject them 
if scored according to his procedure. 

It is suggested that Eysenck is forced to 
resort to the use of “imaginary"’ values be- 
cause there are not, in fact, items with em- 
pirical loadings on T and R which are so 
balanced that the variances will be canceled 
out as in his example. The actual loadings on 
T and R are extremely crucial to the problem 
of asymmetry of item distribution. Eysenck’s 
“imaginary” loadings are not in agreement 
with his findings but are given arbitrary values 
which support the argument which he ad- 
vances in attempting to evade criticism. 

should be a balance between radical
and conservative items in the T 
scale. Eysenck objects to ‘‘specula- 
tive exercises’’ based upon his data 
and for very good reasons as shall be
demonstrated. The exercises to be 
presented are designed simply to 
show the absurdities arising from 
taking Eysenck’s approach seriously. 

First, let us assume that a com- 
munist came across Eysenck’s ma- 
terial and accepted his criteria for 
balancing items in the T scale. With 
relatively little study this person 
could deduce that the responses to 
the items in the various quadrants of 
Eysenck's factor space show fairly 
stable patterns of the degree of
acceptance by members of Eysenck’s
samples of adherents of various
political parties. If, for some reason,
this communist wished to demonstrate
that his middle-class comrades (as
sampled by Eysenck) were really not
“tough-minded” (indeed, that they were
more “tender-minded” than Eysenck’s
sample of conservatives), he could do so
very simply by deleting one of the four
items in the radical “tough-minded”
quadrant and adding one in the radical
“tender-minded” quadrant. Eysenck’s
scoring system does not make communists
“tender-minded” when they accept radical
“tough-minded” items (as is their wont).
It does make them “tender-minded” when
they accept the “tender-minded” radical
items (as is their wont). Such a
substitution of items, which is
completely in accord with Eysenck’s
specifications, serves to make the
communists more “tender-minded.” It also
makes the conservative sample somewhat
less “tender-minded.” Indeed, as shown
in Table 1, the conservatives are now
more “tough-minded” than the communists!

Let us speculate even further. Assume a
conservative wishes to “prove” that
communists are even more “tough-minded”
(especially when compared to
conservatives) than Eysenck’s figures
indicate and that he has caught on to
the fine art of juggling items. Staying
within the ground rules, he then deletes
three items from the conservative
“tough-minded” quadrant (to which some
of Eysenck’s conservative sample were
receptive) and adds three items to the
conservative “tender-minded” quadrant
(which conservatives, appropriately
enough, accept). As indicated in
Table 1, such a manipulation makes the
conservatives even less “tough-minded”
relative to the communists than does
Eysenck’s own procedure.
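
The mechanics of the two exercises can be sketched in a few lines. This is an illustration under stated assumptions, not a reproduction of Table 1: the acceptance rates are invented, and the quadrant counts (4, 3, 5, 2) are inferred from the text (four radical "tough-minded" items, a five-four-three-two distribution, seven radical and seven conservative items). The scoring rule follows the description given here of Eysenck's procedure: one point for accepting a "tender-minded" item or for not accepting a "tough-minded" one.

```python
# Quadrant order: (radical tough, radical tender, conservative tough, conservative tender)
IS_TOUGH = (True, False, True, False)

def expected_t(acceptance, n_items):
    """Expected T score: a point per tender item accepted or tough item not accepted."""
    total = 0.0
    for p_accept, n, tough in zip(acceptance, n_items, IS_TOUGH):
        total += n * ((1 - p_accept) if tough else p_accept)
    return total

# Hypothetical mean acceptance rates per quadrant (assumption, NOT the Table 1 values).
communists = (0.9, 0.8, 0.4, 0.2)
conservatives = (0.5, 0.3, 0.6, 0.8)

# Every layout keeps seven radical and seven conservative items.
layouts = {
    "Eysenck (assumed counts)": (4, 3, 5, 2),
    "communist's juggling": (3, 4, 5, 2),     # one radical item moved tough -> tender
    "conservative's juggling": (4, 3, 2, 5),  # three conservative items moved tough -> tender
}

gaps = {}
for name, counts in layouts.items():
    com = expected_t(communists, counts)
    con = expected_t(conservatives, counts)
    gaps[name] = con - com  # positive: communists look more "tough-minded"
    print(f"{name}: Com. {com:.1f}, Cons. {con:.1f}, difference {gaps[name]:+.1f}")
```

With these invented rates the same respondents are ranked in opposite orders merely by moving one item between quadrants, and the gap can be widened at will by moving three. No respondent has changed a single answer.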

The preceding comparisons could have
been made even more grotesque if the
deletion and substitution of items had
been based upon the actual responses to
individual items as reported by Eysenck
(7, Table III, p. 200) rather than by
manipulating the means of items in the
quadrants.

The point made by these “speculative
exercises” is simple. The requirements
for the T scale which Eysenck has
stipulated are meaningless. His original
scale was so constructed and scored that
the means reported for those affiliated
with various political parties (even
when his arithmetic is corrected insofar
as possible) represent the peculiarities
of the biases built into it rather than
the positions of respondents along any
meaningful measure of
“tough-mindedness.”⁶


6 In many ways it would have been more
logical to have the T-scale items equally
distributed among the four quadrants. If
this had been done, certain interesting
consequences for Eysenck’s speculations
would follow. Computations along the line
indicated in Table 1 indicate that in
agreement with Eysenck’s and Rokeach and
Hanley’s computations, middle-class
Communists sampled by Eysenck are more
“tough-minded” than middle-class members
of major political parties. Contrary to
findings using Eysenck’s distribution of
items in quadrants, however, working-class
Socialists would be more “tough-minded”
than working-class Communists.

Of even greater interest, however, is the
great stress which Eysenck’s theorizing
(9, pp. 259-260) places upon his finding
that in every political party
working-class samples are more
“tough-minded” than are middle-class
samples. If the items had been
symmetrically distributed it would have
been found that working-class Communists
and Conservatives were less
“tough-minded” than middle-class
followers of these parties. The relative
position on the T scale of working- and
middle-class samples of Liberal and
Socialist parties would remain the same.
This is but one example of the effect
which artifacts in measurement have upon
Eysenck’s theory building.


TABLE 1

ARE MIDDLE-CLASS COMMUNISTS OR
CONSERVATIVES MORE “TOUGH-MINDED”?
A “SPECULATIVE EXERCISE” IN ITEM JUGGLING

[Table values not recoverable from the
scan. Rows: “tough-minded” radical and
conservative quadrants, “tender-minded”
conservative and radical quadrants, and
T-scale score. Columns: mean percentage
acceptance of items in quadrants (Com.,
Cons.), then number of items and
contribution to score (Com., Cons.) under
Eysenck’s procedure, the hypothetical
communist’s juggling, and the
hypothetical conservative’s juggling.]


SOME ABUSES OF PSYCHOLOGY


It would have been even more
appropriate to the criticism to examine
the effects of Eysenck’s artifacts on
the scores of fascists as compared
with communists and major party
members. Data on fascists are not
available. However, it is clear that
the present demonstration of the ease
with which Eysenck’s scoring system can
be manipulated to produce contradictory
results by a simple shifting of scale
items is a sufficient reason for viewing
his conclusions as being partially the
result of illogical and unjustified
procedures in scoring items.

An analysis of the responses of
communists to specific T-scale items
in the four quadrants indicated that
they accepted or rejected items along
a radicalism and conservatism
dimension. If an item had a saturation
on the radical dimension communists
accepted it whether it was “tough-”
or “tender-minded.” The analysis
indicated that no prediction could be
made as to whether communists
would accept T-scale items if only
their saturation on T was known but 
that in every case in which knowledge 
of their saturation on R was known 
a perfect prediction could be made as 
to whether communists as a group 
would accept or reject the item. This 
is the basis for making the statement 
that the T scale “does not apply” to
a certain group, in this case, members 
of the Communist Party. It was, 
perhaps erroneously, believed that 
the statement which Eysenck “does 
not understand at all” (11, p. 433) 
was clear in its original context. Evi- 
dently it was not and it shall be 
spelled out. Eysenck's own data un- 
equivocally indicate that communists 
respond to T-scale items according to 
their saturation on R and not to their 
saturation on T. It was therefore
argued that the T scale, as scored by
Eysenck, does not measure
“tough-mindedness” among the members of
the Communist Party whom he sampled.
This is the basic point and Eysenck’s
discussion of the statistical
reliability of the T scale in Coulter’s
sample of communists (11, p. 433) evades
the issue. The issue is validity and not
reliability. Is it too much to assume
that Eysenck knows that it is possible
to have a reliable scale which has no
validity?

Analysis. In his rejoinder, Eysenck
does not touch upon the critical
comments made about his use of “average
of average” scores rather than using
conventional statistical techniques.
He does not justify his arbitrary
lumping together of fascists and
communists when this violates the data.
It is therefore not clear why he
chooses as part of the title of his
rejoinder, “... the Personality
Similarities Between Fascists and
Communists.” His data do not suggest
similarities but rather dissimilarities.

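
The distinction drawn above between reliability and validity can be made concrete with a small simulation. It is a sketch under invented assumptions and not a reanalysis of any data: 2,000 hypothetical respondents have two independent traits, labeled R and T, and every one of 14 hypothetical items responds to R plus random noise and never to T. The seed, sample size, and noise level are arbitrary choices.

```python
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_people, n_items = 2000, 14

r_trait = [rng.gauss(0, 1) for _ in range(n_people)]  # what the items actually tap
t_trait = [rng.gauss(0, 1) for _ in range(n_people)]  # what the scale claims to measure

# Each item is the respondent's R standing plus noise; T plays no part.
items = [[r + rng.gauss(0, 1) for _ in range(n_items)] for r in r_trait]

half_a = [sum(row[0::2]) for row in items]  # odd-numbered items
half_b = [sum(row[1::2]) for row in items]  # even-numbered items
total = [a + b for a, b in zip(half_a, half_b)]

split_half = pearson(half_a, half_b)  # reliability estimate
validity = pearson(total, t_trait)    # agreement with the intended trait

print(f"split-half correlation: {split_half:.2f}")
print(f"correlation with T:     {validity:.2f}")
```

The two halves of such a scale agree closely with one another while the total score tells nothing about T. A high reliability coefficient therefore cannot, by itself, answer the objection that the scale measures R among these respondents.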
THE F SCALE AND “AUTHORITARIANISM”


Eysenck’s statement that the F scale 
measured authoritarianism rather 
than potential fascism was examined. 
It was concluded that such a state- 
ment was completely unjustifiable 
in terms of other research and that 
Eysenck’s own data indicated the 
opposite. The only possible basis for 
such an assertion was the work of 
Coulter in which an alleged politi- 
cally ‘neutral’ group made an un- 
precedentedly low score on the F 
scale. Evidence was presented which 
clearly indicated that this particular 
sample was a highly aberrant one 
when their other test scores, as re- 
ported by Eysenck, were evaluated. 

Eysenck’s only reply to this por- 
tion of the critique is his refusal 
to acknowledge the relevance of data 
collected in Great Britain by Rokeach. 
He is correct in objecting to a refer- 
ence to material which had (at the 
time the critique was written) been 
submitted but not accepted for pub- 
lication. Since Rokeach’s monograph 
is now in press, the reader will be able 
to evaluate Rokeach’s finding that 
the British college students sampled 
by Rokeach have lower F-scale scores 
than do his sample of British workers 
(17). 

The reasons for rejecting Eysenck’s
argument that communists score high
on the F scale were indicated in the
critique. Since other of Rokeach’s
data bear directly upon this point,
they are worthy of reproduction.
Rokeach gave the F scale (1, pp.
255-257) to students at London
University. The respondents were asked
to indicate their political preferences.
Table 2 indicates the results.


TABLE 2

F-SCALE SCORES OF STUDENTS IDENTIFYING
WITH VARIOUS POLITICAL PARTIES*

Party Identification    Item-mean Score

Conservative                 3.98
Liberal                      3.39
Labor (Attleeites)           3.51
Labor (Bevanites)            3.12
Communist                    2.86

* Item means computed from (17, Table 13, p. 34).

Completely contrary to Eysenck’s 
statement about communist scores on 
the F scale, the communistically in- 
clined students at the University of 
London, where Eysenck teaches, 
scored lowest on the F scale. This 
finding is in complete accord with 
some twenty years of research on 
similar types of scales as was indi- 
cated in the critique. 
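
The ordering in Table 2 can be checked directly. The sketch below simply ranks the five item means as read from the table; the variable names are illustrative, and the pairing of scores with parties assumes that both are listed in the same order in the original.

```python
# Item-mean F-scale scores as read from Table 2 (pairing by row order is an assumption).
f_scale_means = {
    "Conservative": 3.98,
    "Liberal": 3.39,
    "Labor (Attleeites)": 3.51,
    "Labor (Bevanites)": 3.12,
    "Communist": 2.86,
}

# Rank the groups from lowest to highest item mean.
ranked = sorted(f_scale_means, key=f_scale_means.get)
lowest, highest = ranked[0], ranked[-1]

for party in ranked:
    print(f"{party:<20} {f_scale_means[party]:.2f}")
```

The communistically inclined students fall at the bottom of the ranking and the Conservatives at the top, which is the reverse of what Eysenck's statement would lead one to expect.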

It is once more concluded that 
Eysenck’s attempted equation of F- 
scale scores and authoritarianism 
based upon Coulter’s samples is
completely out of line with all
available data including his own.


FURTHER REMARKS 


The previous discussion has dealt 
with the specific points made in the 
critique and Eysenck’s failure to 
answer them adequately. In evading 
these criticisms, Eysenck raises many 
points which require clarification. 
Two will be singled out for comment:
his discussion of ‘‘pure’’ T-scale items 
and his use of research other than his 
own to support his speculations. 

“Pure” T-scale items. An important
aspect of Eysenck’s theorizing is that,
“The T-factor itself ... is rather the
projection on to the social attitude
field of a set of personality
variables” (9, p. 170). As noted in the
critique, this conclusion of Eysenck’s 
might well have resulted from the 
fact that his procedure of collecting 
items from existing scales could not 
have possibly uncovered “pure” measures
of T since such were not included
in the scales subjected to pruning. 
It was suggested that, “If, instead 
of taking items which had been of 
relevance in previous research, he 
had analyzed the definition of tough- 
mindedness and then selected, in- 
vented, or modified items which ap- 
peared relevant and then factor an- 
alyzed responses to them and other 
items he might well have isolated a 
much purer dimension of
‘tough-tender-mindedness’” (4, pp. 427-428).

Eysenck apparently agrees with 
this criticism of his original pro- 
cedure since he quotes the statement 
and then says, “Having attempted 
for many years to do what Christie 
advocates, and having had several 
students make similar attempts, all 
without success, the writer believes 
that Christie is somewhat optimistic’’ 
(11, p. 432). 

An examination of Eysenck’s pub- 
lished work does not indicate a single 
instance of his ever having offered an 
analysis of ‘‘tough-mindedness.”’ In 
his earliest work in this area the items
were described as forming a
“practical-theoretical” dichotomy and
Eysenck later noted that the
interpretation of the factor was
entirely subjective (9, p. 119). It was
in the report of work done upon the
basic middle-class sample (5) that the
terms
“tough-" and ‘‘tender-minded” were 
first applied to the factor which ap- 
peared to remain after the radical- 
conservative axis was extracted. No 
formal definition was given but com- 
parisons between the implicit content 
of the items and some of William 
James’s comments about the tender- 
minded and the tough-minded have 
been made by Eysenck (9, p. 131). 
Such an identification is extremely 
tenuous. If we were to take Eysenck’s
usage of James seriously we would
have to conclude that communists
and fascists were not dogmatic (since
dogmatism is a Jamesian tender- 
minded trait and according to Ey- 
senck communists and fascists are 
“‘tough-minded’’)! 

No evidence is presented in sup- 
port of Eysenck’s contention, nor 
does he indicate which of his students 
have attempted to undertake an analysis
of the term “tough-mindedness.” Melvin,
whose scales of R and T have been cited
by Eysenck as being “improved” versions
(9, p. 132), is the only student known
to have done any other work on the
construction of the T scale.⁷ This is
his description of his approach:

The most logical way to begin this search 
would be to make a theoretical analysis of the 
concept of tendermindedness and then make 
formal deductions to a series of hypotheses 
about its verbal manifestations. This proce- 
dure was considered, but it soon became clear 
that it was difficult to arrive at any conclu- 
sions about the essential psychological nature 
of T by pure thought alone, and that a strictly 
formal approach would have to be abandoned 
(15, p. 122, italics in original). 


Melvin’s basic procedure was
Eysenckian. He examined attitude 
scales published since 1947 in search 
of items (Eysenck had gone over 
earlier scales). In addition, however, 
he gathered items from the expressed 
opinions of minority group members 
and publications, from a political en- 
cyclopedia, and other new items were 
originated upon the basis of discus- 
sions with Eysenck. His pool of 239 
items was an adequate sampling of 
existing scales and also contained 
original material in contradistinction 
to Eysenck’s original 40 items (4, p. 
427). 


7 Melvin’s thesis (15) was mentioned in the 
critique as being unknown at the University 
of London Library. It has been filed since the 
critique was submitted for publication (per- 
sonal communication from the Univer. of 
London Library, Aug., 1955). 

It is therefore not surprising that
such an essentially empirical ap- 
proach, although conducted with 
methodological sophistication, led to 
the discovery of relatively few items 
which even hinted at the existence 
of ‘‘pure’’ T items. Seven items were 
found which had negligible loadings 
on R (.08 or less) and modest loadings
on T (.20 or more) (15, Appendices
A-D, pp. i-xxviii).
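
The criterion just described amounts to a simple filter on the two loadings of each item. The sketch below is illustrative only: the item labels and loadings are invented and are not taken from Melvin's appendices, and treating the thresholds as applying to absolute values is an assumption, since the text gives only the magnitudes.

```python
from typing import NamedTuple

class Item(NamedTuple):
    label: str
    r_loading: float  # saturation on R (radicalism-conservatism)
    t_loading: float  # saturation on T

def is_nearly_pure_t(item: Item, max_r: float = 0.08, min_t: float = 0.20) -> bool:
    """True when the R loading is negligible and the T loading at least modest (by magnitude)."""
    return abs(item.r_loading) <= max_r and abs(item.t_loading) >= min_t

# Hypothetical item pool (assumption); real loadings would come from the factor analysis.
pool = [
    Item("item A", 0.05, 0.31),    # passes both thresholds
    Item("item B", -0.45, 0.40),   # heavily saturated with R
    Item("item C", 0.02, 0.10),    # T loading too small
    Item("item D", -0.08, -0.22),  # passes on magnitudes
]

pure = [item.label for item in pool if is_nearly_pure_t(item)]
print(pure)
```

Applied to a pool of 239 items, a filter of this kind that returns only seven shows how little of the pool was free of R to begin with.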

In his concluding chapter, Melvin 
says, ‘‘The difficulty noted... of 
obtaining valid Tendermindedness 
scores for toughminded-radicals raises 
another urgent problem. This might 
well be approached along similar 
lines to those adopted by the authors 
of The Authoritarian Personality... 
in their development of the California 
F-scale”’ (15, p. 344). 

The inferences to be drawn from 
Melvin’s work are: (a) his approach 
was essentially empirical and did not 
follow from any rigorous analysis of 
the definition of ‘‘tough-mindedness,”’ 
and (b) he apparently believes that 
valid T-scale items might be obtained 
by a more theoretically oriented ap- 
proach. 

Eysenck’s claim that he and his 
students have been attempting for 
years to do research based upon an 
analysis of the definition of T is not 
supported by any known published 
or unpublished material. Indeed, the 
most recent and most relevant thesis 
—done under Eysenck’s own super- 
vision—suggests that such an ap- 
proach be tried! 

It is a moot point whether or not 
‘‘pure’’ T items could be uncovered. 
To repeat the point made in the 
critique, the procedures used by 
Eysenck and his students could not 
possibly have uncovered many of 
them—if they did exist—because no 
formal attempt was made to define 
what they were looking for and the
selection of items was limited primarily
to those used by other investigators.

Eysenck's use of supporting re- 
search. In his replies to Rokeach and 
Hanley and to this critic, Eysenck 
has not chosen to answer specific 
criticisms about his methodology. In- 
stead he has preferred to rely upon 
references to unpublished theses of 
his students. 

There are three studies done by 
Eysenck and his students which have 
contained comparisons between com- 
munist, fascist, and other political 
party samples’ scores upon the T 
scale. The first of these was the 1951 
study of Eysenck (7). Flaws in this 
study were pointed out and these 
were not directly answered in Ey- 
senck’s reply. 

The second was an unpublished 
thesis by Coulter which Eysenck 
referred to in his reply to Rokeach 
and Hanley as supporting his posi- 
tion. This critic raised questions 
about Coulter's thesis which, if cor- 
rect, invalidated it as a meaningful 
comparison of the ‘‘tough-minded- 
ness’ of communists, fascists, and 
major party members. Eysenck does 
not mention, let alone attempt to 
answer these criticisms in his reply. 
He also does not again cite this thesis 
in support of his position in his reply. 

The third study was an unpub- 
lished thesis by Nigniewitzky. In 
his reply to this critic, Eysenck
places primary reliance upon it and 
anticipates that it might be consid- 
ered irrelevant to criticisms based 
upon earlier studies (11, p. 435). 
Such an anticipation is correct. Even 
if Nigniewitzky’s data could with- 
stand critical methodological scrutiny 
there would be no justification for the
many errors made by Eysenck. If 
Nigniewitzky’s results are correctly 
reported by Eysenck, the latter is 
placed in the position of having blundered
onto the truth despite a concatenation
of critical mistakes.

The temporary unavailability of 
Nigniewitzky’s thesis⁸ makes it impossible
to determine whether it is
relevant to the following statement
by Eysenck, “Thus, an improvement
in sampling procedures, as demanded
[sic] by Christie, and an improvement
in the scale used do not result,
as would be predicted from his criticisms,
in a lessening of the observed
differences between communists and
the orthodox political parties ...”
(11, p. 436, his italics).

It is unclear from Eysenck’s statements
what sort of a sample was
utilized by Nigniewitzky. According
to Eysenck, when replying to Rokeach
and Hanley, it was “a properly
stratified sample of the French population ...”
(10, p. 178). According
to Eysenck, when replying to this
writer, it was “... a properly selected
representative sample of the French


8 The only possible way to get copies of
theses from the University of London Library
is to request that a microfilm copy be prepared.
The person requesting such a copy
must pay for it and sign a statement that it
will not be quoted without written permission
of the author. This procedure was followed in
the case of Coulter's thesis and it required
some five months from the initial inquiry until
the microfilm copy was received.

In view of the short period of time available
for a reply to Eysenck it did not appear feasible
to wait for a microfilm copy of Nigniewitzky's
thesis nor was it certain that quotations
would be permitted. Following Eysenck’s
remarks in Footnote 2 of his reply
(11, p. 437), the writer requested copies of
both Melvin's and Nigniewitzky's theses. A
copy of Melvin’s thesis was graciously forwarded.
Eysenck said that a copy of Nigniewitzky's
thesis had not yet been received from
the University of London Library. At the
same time (March 7, 1956) Nigniewitzky was
also sent a letter requesting a copy of his thesis.
No reply has been received as of the time
this article was submitted for publication.
middle-class population” (11, p. 435,
italics added). 

It is also unclear from what Eysenck
says what improvements in the
T scale were made and what rele- 
vance these might have to the criti- 
cisms made of the earlier version. 
The only “improved” versions of the 
T scale mentioned by Eysenck in 
other contexts are those by Melvin. 
The problem of unequal distribution 
of T-scale items in the four quadrants 
also exists in Melvin’s two scales 
since they both had ten items in each 
of the two ‘‘tough-minded” quad- 
rants and six in each of the two 
“tender-minded” quadrants (15, Appendices
A-D, pp. i-xxviii). This distribution,
combined with either of
the two scoring systems discussed by
Melvin, would not alleviate the
sources of bias discussed in compar- 
ing communists with members of 
major political parties on the T 
scale. Unlike Eysenck, Melvin recog- 
nizes this problem and discusses in 
his thesis the then unsolved problem 
of communists responding to T-scale 
items in terms of their loading on R 
rather than T (15, pp. 219-225). 

Aside from a lack of satisfactory 
detail, Eysenck’s remarks about the 
greater differences found between 
communists and major party mem- 
bers in France than in England 
evade the issue and are completely
irrelevant. Gross methodological
errors in Eysenck’s and Coulter's
studies of British party members
made their comparisons meaningless.
It therefore follows that no prediction
whatsoever can be made as to
whether or not a new study (using
proper sampling and measurement
procedures) would show an increase
or lessening in relative differences of
parties along the T scale when compared
with their results.


It would be unjust to prejudge 
Nigniewitzky's thesis upon the basis
of Eysenck’s ambiguous remarks. It
would be unwise to accept the latter's
statements about it at face
value, however, since Eysenck
“... cannot agree with any of the
major criticisms put forward ...”
(11, p. 431) of his own work and 
nowhere indicates an awareness of 
the implications of his many method- 
ological excesses. 

Eysenck concludes his flight from 
criticism by inviting the reader to ex- 
amine five studies using the T scale 
carried out in countries other than 
Great Britain. The suggestion is
irrelevant as far as criticism of
Eysenck’s procedures is concerned.
The prospective reader should be reminded
that with the presumed exception
of Nigniewitzky’s study, all
of them utilized Eysenck’s original T
scale. Their interpretation is therefore
subject to all the cautions necessitated
in evaluating results obtained
with this unique “measurement” instrument.

It is contended that Eysenck’s extensive
citation of research other than
his own fails to answer the methodological
criticisms raised about his
own work. In those instances where
such research has been available for
examination, it does not support
Eysenck but confirms the criticisms
made or is irrelevant to them.


CONCLUSION 


Eysenck contends that communists
and fascists are more “tough-minded”
and “authoritarian” than
are members of major political parties.
This plausible assumption turns

9 For an amusing earlier demonstration of
the same point see the comments by Greenall
(12) and Eysenck’s reply (6).
out, upon critical inspection, to be
based upon errors of computation, 
uniquely biased samples which forbid 
any generalizations, scales with built- 
in biases which do not measure what 
they purport to measure, unexplained 
inconsistencies within the data, misinterpretations
and contradictions
of the relevant research of others, 
and unjustifiable manipulations of 
the data. Any one of Eysenck's many 
errors is sufficient to raise serious 
questions about the validity of his 
conclusions. In toto, absurdity is
compounded upon absurdity, so that 
where, if anywhere, the truth lies is 
impossible to determine. 

It had been hoped that Eysenck's 
reply to specific criticisms would be 
directed toward acknowledging their 
relevance or rebutting them. If this 
had been done our exchange would 
have served to clarify problems and 
sharpen legitimate points of differ- 
ence. Instead, Eysenck does not rebut 
a single specific criticism. 

Eysenck’s responses to those critical
points of which he takes note invariably
evade the specific issue. Reliance
is placed upon an extensive citation
of the research of others.
Those that are available do not support
his position but indicate the
cogency of the criticism.

This critic rests his case. It is be- 
lieved that the detailed and perhaps 
tedious documentation of Eysenck’s
scientific sins of omission and com- 
mission is sufficient to raise grave 
doubts about the validity of his 
conclusions. The reader is invited to 
decide for himself whether or not 
Eysenck’s many methodological er- 
rors and his evasions of specific criti- 
cisms constitute abuses of psychol- 
ogy. 
THE QUANTITATIVE STUDY OF SHAPE AND
PATTERN PERCEPTION¹


FRED ATTNEAVE AND MALCOLM D. ARNOULT
Skill Components Research Laboratory, Air Force Personnel and Training Research Center 


The pre-eminent importance of 
formal or relational factors in per- 
ception has been abundantly demon- 
strated during some forty years of 
gestalt psychology. It seems extra- 
ordinary, therefore, that so little 
progress has been made (and, indeed, 
that so little effort has been ex- 
pended) toward the systematizing 
and quantifying of such factors. Our 
most precise knowledge of perception 
is in those areas which have yielded 
to psychophysical analysis (e.g., the 
perception of size, color, and pitch), 
but there is virtually no psycho- 
physics of shape or pattern. 

Several difficulties may be pointed 
out at once: (a) Shape is a multidimensional
variable, though it is often
carelessly referred to as a “dimension,”
along with brightness, hue,
area, and the like. (b) The number
of dimensions necessary to describe 
a shape is not fixed or constant, but 
increases with the complexity of the 
shape. (c) Even if we know how 
many dimensions are necessary in a 
given case, the choice of particular 
descriptive terms (i.e., of reference- 
axes in the multidimensional space 
with which we are dealing) remains 
a problem; presumably some such 
terms have more psychological mean- 
ingfulness than others. 


1 This research was carried out at the Skill 
Components Research Laboratory, Air Force 
Personnel and Training Research Center, 
Lackland Air Force Base, San Antonio, 
Texas, in support of Project 7706, Task 27001. 
Permission is granted for reproduction, trans- 
lation, publication, use, and disposal in whole 
or in part by or for the United States Govern- 
ment. 
The need for an adequate psychophysical
framework is most obvious
in those studies (having to do with 
discrimination, for example, or with 
positive or negative transfer) in 
which it is necessary to manipulate 
shape or pattern as an independent 
variable. Unless some meaningful 
units of variation are specifiable, 
functional relationships cannot be 
obtained. It is somewhat less obvi- 
ous, but nonetheless true, that a com- 
parable need exists in experiments 
which seek to determine how form 
perception is influenced by extrinsic
variables such as size, contrast,
method and degree of familiarization,
etc. In studies of this sort, the experimenter
commonly uses some small,
arbitrarily chosen set of stimuli: 
sometimes simple geometrical forms; 
sometimes a group of ‘‘nonsense”’ 
shapes which he draws in a more or 
less haphazard manner. If the results 
obtained are ‘‘significant’’ in the 
usual sense, we have some specifiable 
degree of confidence that they are gen- 
eralizable to people other than those 
used as subjects, but the degree to 
which they are generalizable to new 
stimuli remains a matter of conjec- 
ture. Yet the latter kind of generaliz- 
ation is no less important than the 
former. Only in rare cases of applied 
research is the investigator really 
content with results which hold only 
for the particular stimulus objects 
employed experimentally. 

Egon Brunswik (9, 10, 11) is per- 
haps the only psychologist who has 
ever given due weight to the im- 
portance of stimulus-sampling, or of 
situation-sampling in general. Although
the approach of this paper
is somewhat different from Bruns- 
wik’s, for reasons which are devel- 
oped below, we wish to acknowledge 
freely Brunswik’s influence upon our 
own thinking, and to commend his 
writings on this subject to any reader 
unacquainted with them. Brunswik 
takes the reasonable position that results
with “ecological validity” may
be obtained only by the use of experi- 
mental materials which are drawn 
from, and hence representative of, 
the real situations to which one 
wishes to generalize. Thus, in the 
study of shape perception, it would 
be desirable to experiment with the 
shapes of natural objects. Suppose, 
however, that we wish to investigate 
the learning and memory of shapes 
with which subjects are initially unfamiliar:
the requirement of unfamiliarity
will obviously preclude the experimental
use of shapes which are
commonly encountered. Is there any 
sensible procedure for choosing stim- 
ulus-materials in this sort of situa- 
tion? 

It is our belief, at this time, that 
the problem of generalizing from ex- 
perimental stimuli may profitably 
be broken into two parts. First, 
there is the problem of specifying the 
stimulus-domain, i.e., the problem 
of drawing a sample of stimuli from 
a parent population characterized by 
certain determinate statistical pa- 
rameters. The stimulus-domain, or 
parent population, includes all those 
stimuli to which the results may be 
generalized, and is defined by the sta- 
tistical parameters which characterize 
it. In the following section we shall 
indicate a variety of particular meth- 
ods for drawing ‘‘random”’ patterns 
and shapes from such clearly defined 
hypothetical populations, to which 
experimental results may then be generalized
with measurable confidence.

The second problem, which is 
really a special case of the first, is 
that of drawing a sample which has 
“ecological validity.” If our real aim
is to generalize to natural forms, or 
to some subset thereof, it is neces- 
sary to estimate the psychologically 
important statistical parameters of 
these natural forms in order that ex- 
perimental materials may be con- 
structed to possess the same parame- 
ters. Thus, we are brought back to 
the acute need for a general psycho- 
physics of form. In the final section 
we shall discuss the kinds of physical 
analysis and measurement which ap- 
pear appropriate to such a psycho- 
physics. 


THE CONSTRUCTION OF STIMULI 


All the methods described below 
for constructing nonsense shapes and 
patterns have in common the fact 
that the particular characteristics of 
each figure are randomly determined. 
Each method is, in effect, a set of 
rules by which points are plotted and 
connected in accordance with values 
obtained from a table of random 
numbers. Each method, or set of 
rules, thus determines a domain of 
stimuli. The stimuli actually con- 
structed for use in a given experiment 
will, if they are all constructed ac- 
cording to the same rules, be a ran- 
dom sample of the stimulus-domain 
defined by the set of rules. The ex- 
perimental results, consequently, may 
be generalized both to the entire 
stimulus-domain and to the appropriate
subject population.²


2 The kind of double-generalization proposed
here would require an error term which
included the variance due to subjects, the
variance due to stimuli, and the interaction
between them. In what is perhaps the most
obvious analysis-of-variance design, the
subjects × stimuli × treatments mean square would
be the appropriate error term to use.
The experimenter who desires to
use stimuli constructed in this man- 
ner must determine what set of rules 
will provide him with a stimulus 
population having the character- 
istics he wants. If one desires to gen- 
eralize experimental results to the 
world of real objects (chairs, airplanes,
people, etc.), it is necessary
to have a stimulus sample possessing 
ecological validity. To construct 
nonsense stimuli of this sort one 
must know the pertinent parameters 
of the stimulus-domain of real ob- 
jects and use these parameters in 
constructing the experimental stim- 
uli. In the next section we shall dis- 
cuss some of the problems inherent 
in this methodological requirement 
and some of the attempts which have 
been made to solve them. 

In the present section, some gen- 
eral methods for constructing stimuli 
are described in sufficient detail that 
the reader, if he desires, may repeat 
the operations in order to develop 
additional stimuli belonging to the 
various stimulus-domains defined by 
the methods. It should be kept in 
mind, however, that these methods 
are described merely as examples 
and are not intended to constitute a 
comprehensive catalog of all possible
methods. Descriptions will be given 
of methods for generating shapes 
having either closed or open contours, 
for generating various kinds of patterns,
and for introducing systematic
variations or transformations of 
shapes or patterns. 


Closed Contours—Angular Shapes 


Method 1. Starting with a sheet of 
graph paper (say 100 × 100), successive
pairs of numbers between 1 and
100 are selected from a table of random
numbers. Each pair will determine
a point which can be plotted on
the 100 × 100 matrix. The total number
of such points to be plotted can
be determined either randomly or 
arbitrarily. 

When all the points have been 
plotted, a straightedge is used to 
connect the most peripheral points 
in such a way as to form a polygon 
having only convex angles. This 
operation will usually leave some un- 
connected points within the polygon 
(Fig. 1a). When a point falls within 
some small, arbitrarily chosen dis- 
tance of the proper perimeter (e.g., 
the point between segments 7 and 8 
in Fig. 1a) it is included even though 
it makes a slightly concave angle, 
since otherwise an indentation prac- 
tically dividing the shape into two 
parts might later occur. The sides of 
the polygon are numbered, and the 
points remaining inside are assigned 
letters. The table of random numbers 
is then used to determine which of the 
FIG. 1. SUCCESSIVE STAGES IN THE CONSTRUCTION OF A “RANDOM” FIGURE
ACCORDING TO METHOD 1 (SEE TEXT)
central points is connected to which
side. In the example given, Point C
was connected to Side 2, forming in
the process Side 10 (Fig. 1b). At this
stage in the construction, the possi- 
bilities of connecting points have 
been changed. Point A may now be 
taken into Sides 3, 4, 5, 6, 7, 8, or 10, 
but not into Sides 1, 2, or 9. Point B 
may be connected only to Side 2 or 
Side 10. If Point A is connected to 
Side 5, forming new Side 11, there 
remains only the possibility of con- 
necting Point B to Side 2 or Side 10 
(see Fig. 1b). Connecting Point B
to Side 10 completes the shape, which 
finally appears as shown in Fig. 1c. 
It will be noted that every step 
in the procedure is determined either 
randomly or by the elimination of all 
other possibilities. Furthermore, 
every step is completely determinate 
and can be duplicated by anyone us- 
ing the same rules and the same selec- 
tions from the table of random num- 
bers. 
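
The steps of Method 1 can be sketched in code. The following Python is a
minimal sketch under stated assumptions, not a transcription of the hand
procedure: a seeded pseudo-random generator stands in for the table of
random numbers, the plotted points are forced to be distinct, the convex
polygon is taken to be the convex hull of the plotted points, the
tolerance rule for points lying very near the perimeter is omitted, and a
central point is simply left unconnected if no side can receive it without
lines crossing. All function names are hypothetical.

```python
import random


def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (positive if counter-clockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: the peripheral points, counter-clockwise."""
    pts = sorted(set(points))
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def on_segment(a, b, p):
    """For p collinear with a-b: does p lie between a and b (inclusive)?"""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segments_conflict(a, b, c, d):
    """True if segments a-b and c-d share a point other than a common endpoint."""
    d1, d2 = cross(c, d, a), cross(c, d, b)
    d3, d4 = cross(a, b, c), cross(a, b, d)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # the two segments properly cross
    shared = {a, b} & {c, d}
    checks = ((a, c, d, d1), (b, c, d, d2), (c, a, b, d3), (d, a, b, d4))
    return any(s == 0 and p not in shared and on_segment(q, r, p)
               for p, q, r, s in checks)


def method1_shape(n_points, rng, size=100):
    """Return the vertices, in order, of one random closed angular shape."""
    grid = [(x, y) for x in range(1, size + 1) for y in range(1, size + 1)]
    points = rng.sample(grid, n_points)      # distinct random coordinate pairs
    polygon = convex_hull(points)            # polygon through the peripheral points
    central = [p for p in points if p not in polygon]
    rng.shuffle(central)                     # random order for the central points
    for p in central:
        m = len(polygon)
        sides = [(polygon[i], polygon[(i + 1) % m]) for i in range(m)]
        eligible = []
        for i, (a, b) in enumerate(sides):
            others = sides[:i] + sides[i + 1:]
            if not any(segments_conflict(a, p, c, d) or segments_conflict(p, b, c, d)
                       for c, d in others):
                eligible.append(i)           # side i can take p without crossings
        if eligible:                         # otherwise p stays unconnected
            polygon.insert(rng.choice(eligible) + 1, p)
    return polygon
```

Because the generator is seeded, a call such as
`method1_shape(8, random.Random(1956))` returns the same vertex list every
time, which mirrors the point made above that each step can be duplicated
by anyone using the same rules and the same selections from the table.
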
Method 2. This method of constructing
random shapes is also
started by plotting successive pairs 
of random numbers as coordinates on 
graph paper. As each point is plotted 
it is given a number so that eventually
all are numbered serially.
These points are then connected in 
the order in which their serial num- 
bers first appear in a table of random 
numbers, except that numbers which 
violate certain rules of construction 
are rejected. The incomplete con- 
struction shown in Fig. 2a will pro- 
vide examples of permitted and non- 
permitted connections. The rules for 
connecting points are as follows: 

a. No line may be drawn twice. 
Assume, in Fig. 2a, that the last line 
drawn was from Point 2 to Point 5. 
If the next number in the table were 
2, it would be rejected since that con- 
nection has already been made. 

b. No line may be drawn which 
completely encloses a point within 
the perimeter of the figure. From 
Point 5 it would not be permissible to 
draw a line to Point 6 or to Point 4, 
since either action would completely 
enclose Points 3 and 8. 

c. No two points may be directly 
connected if they are already con- 
nected by a path which follows per- 
imeter lines without passing through 
any other plotted points. For ex- 
ample, Point 5 may not be connected 
FIG. 2. EXAMPLE OF NONSENSE SHAPE CONSTRUCTED BY THE RULES OF METHOD 2:
a. INCOMPLETE CONSTRUCTION DEMONSTRATING PERMISSIBLE AND NONPERMISSIBLE
CONNECTIONS; b. THE COMPLETED SHAPE
to Points 3 or 7, Point 3 may not be
connected to Points 5 or 6, and Point 
2 may not be connected to Point 4. 

d. The figure is complete when 
each point has been connected to at 
least two other points. It sometimes 
happens that the table of random 
numbers leads one to a point which 
already has all the other connections 
allowed it. In this case one of the 
other points is chosen randomly as a 
new origin and the regular process is 
continued. The incomplete shape of 
Fig. 2a is shown in a completed form 
in Fig. 2b.

As is the case with all the methods 
described in this paper, this method 
is completely objective. The result- 
ing figure could be reproduced, if 
necessary, from a set of coded instruc- 
tions consisting only of the numbers 
originally selected from the table. 

Unlike Method 1, Method 2 usu- 
ally generates shapes containing some 
angles in addition to all those at 
originally plotted points. This dif- 
ference is emphasized by Rule c of 
Method 2. 

In Method 1 there are no restric- 
tions on the ways in which the plotted 
points may be connected except that 
(a) the figure must be closed, and (b)
connecting lines may not cross, i.e., 
the completed figure may have angles 
only at the original points. 

In Method 2, on the other hand, 
there may be “emergent’’ angles at 
places other than originally plotted 
points, and the figures produced tend 
to be characterized by “good continuation.”
Again, it is Rule c of Method
2 which causes many of the perimeter 
lines of the final figure to be continua- 
tions of other perimeter lines. 

Comparing the two methods in 
terms of the informational content 
of the shapes produced shows that 
in Method 1 information (in addition 
to that required to locate the original 
points) is used only in connecting the
interior points to the sides of the 
original perimeter, whereas in Method 
2 information is used in making all 
connections between plotted points. 
For this reason a Method 2 shape 
composed of n original points and
containing n + k angles (k representing
the number of “emergent” points)
will contain more information than a
Method 1 shape composed of n original
(and final) points. Because of the
good continuation introduced into
the figure, however, the Method 2
shape having n + k points will contain
less information than would a Method
1 shape having n + k original points.

Method 3. Fitts, Weinstein, Rap- 
paport, Anderson, and Leonard (15) 
have developed a technique for con- 
structing ‘‘metric’’ figures, the in- 
formational content of which may be 
easily and accurately determined. 
Starting with a somewhat smaller 
matrix (say, 8 × 8), the number of
cells to be filled (from the bottom up) 
in each column of the matrix is ran- 
domly determined. This method pro- 
duces shapes which belong to a rela- 
tively small stimulus-domain and 
which are equal in informational con- 
tent. A variation of this method in- 
volves allowing each possible column- 
height to appear only once in each 
shape, with the order of appearance 
determined randomly. This second 
stimulus-domain contains members
which are equal in area and, consequently,
contain less information
than the shapes first described. Still
another variation may be introduced
by reflecting each shape on one of its
axes to produce a symmetrical shape
containing no more information than
its nonsymmetrical predecessor. Examples
of these various classes of
metric figures may be found in Reference 15.

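
The three classes of metric figures in Method 3 can be illustrated with a
short sketch. It assumes (this is not stated in the text) that each column
height from 1 to n is equally likely, so that an unconstrained figure
carries n·log2(n) bits and a figure in which each height appears exactly
once carries log2(n!) bits. The reflection is taken about a vertical axis
at the edge of the figure, which is one reading of "one of its axes". The
function names are hypothetical.

```python
import math
import random


def metric_figure(rng, n=8):
    """Column heights of an n x n metric figure; each column holds 1..n cells."""
    return [rng.randint(1, n) for _ in range(n)]


def constrained_metric_figure(rng, n=8):
    """Variation: every column height 1..n appears exactly once, in random order."""
    heights = list(range(1, n + 1))
    rng.shuffle(heights)
    return heights


def symmetric_figure(heights):
    """Reflect the figure about a vertical axis placed at its right edge."""
    return heights + heights[::-1]


def to_rows(heights, n=8):
    """Render the figure as text, filling each column from the bottom up."""
    return ["".join("#" if h >= row else "." for h in heights)
            for row in range(n, 0, -1)]


def bits_unconstrained(n=8):
    """Information in n independent, equiprobable column heights."""
    return n * math.log2(n)


def bits_constrained(n=8):
    """Information in a random ordering of the n distinct column heights."""
    return math.log2(math.factorial(n))
```

Under these assumptions an 8 × 8 figure of the first kind carries 24 bits,
while the constrained kind carries a little over 15 bits and always covers
36 cells; the mirrored figure adds no information because its second half
is fully determined by the first.
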
FIG. 3. METHOD FOR INTRODUCING “RANDOM” CURVES INTO AN ANGULAR NONSENSE
SHAPE. THE ORIGINAL SHAPE IS THE SAME ONE WHICH APPEARED IN FIG. 1c


Closed Contours—Curved Shapes


Method 4. This method describes
a procedure for making wholly or
partially curved shapes from the
angular shapes constructed by
Method 1 or 2. This procedure may
appear to be somewhat involved, but
actually it requires more time to describe
than to perform. Essentially,
it consists merely of replacing angles
with inscribed arcs, of curvature
chosen randomly within limits imposed
by the figure.

For purposes of demonstrating the
method, let us start with the shape
described and constructed under
Method 1 (Figs. 1a-1c). It is decided
(arbitrarily or randomly) that
four of the twelve angles are to be 
curved. Let us suppose that Angles 
C, F, J, and K (Fig. 3) are chosen. 
(For convenience of exposition the 
angles have been assigned the letters 
A through L.) The first step in the 
process consists in constructing line 
Cp, which is the bisector of ∠BCD.
Then, the shorter of the two arms of 
the angle (in this case, line BC) is
divided into equal units. These units
may be chosen for convenience. For 
example, Fig. 3 was constructed on a 
100 × 100 matrix having matrix units
equal to 0.20 in., and Line BC was 
arbitrarily divided into segments of 
0.25 in. each. It should be noted that 
the divisions of the line are num- 
bered in sequence, starting always 
from the apex of the angle. 

One of these numbered points on 
line BC is now chosen at random and 
a perpendicular from Line Cp to it 
is constructed (Line 5-q). This line
(5-q) now becomes the radius of an
arc which is inscribed within ∠BCD.
The arc is tangent to Line BC and
Line CD at points equidistant from
C. Thus, ∠BCD has now been replaced
by a curve (actually, two linear
segments and an arc) going from
B to D. 

Point F has been curved by the 
same process. Angle EFG is bisected
by line Fr, and Line FG is divided
into equal segments. Division 8 having
been chosen at random, line 8-s
is constructed and used as a radius
for inscribing a curve within ∠EFG.

The next two constructions dem- 
onstrate the complex curvature which 
may result when successive points 
are chosen to be curved. Point J 
is curved by the process described 
above, with Line 13-u being used as 
the radius of an arc inscribed within 
∠IJK. However, in curving Point
K it is necessary to inscribe an arc
within ∠J’KL, not within ∠JKL.
Point J’ is the point at which the arc
constructed with radius 13-u becomes
tangent to line JK.

If it is so desired, all the points of 
an angular figure may be curved. It 
should be noted, however, that the 
shorter arm of every angle is divided 
into segments, and that its divisions 
are numbered beginning with zero. If 
the zero is the random choice, the resulting
curve will have zero radius,
i.e., that angle remains as originally 
drawn. 
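
The geometry of Method 4 can be written out for a single angle. The sketch
below assumes that the construction line is perpendicular to the arm at
the chosen division and meets the bisector at the center of the arc. On
that assumption, if the chosen division lies at distance t from the apex
and the angle at the apex is θ, the radius is t·tan(θ/2) and the center
lies at distance t/cos(θ/2) from the apex along the bisector. The function
names, and the use of a pseudo-random generator in place of the table of
random numbers, are hypothetical.

```python
import math
import random


def inscribed_arc(b, c, d, t):
    """Arc replacing the angle at apex c between arms c-b and c-d.

    t is the distance from the apex to the chosen division on the arm.
    Returns (center, radius, tangent_point_on_cb, tangent_point_on_cd).
    """
    def unit(p, q):
        dx, dy = q[0] - p[0], q[1] - p[1]
        length = math.hypot(dx, dy)
        return dx / length, dy / length

    ub, ud = unit(c, b), unit(c, d)
    cos_theta = max(-1.0, min(1.0, ub[0] * ud[0] + ub[1] * ud[1]))
    half = math.acos(cos_theta) / 2            # half of the angle at the apex
    bx, by = ub[0] + ud[0], ub[1] + ud[1]      # direction of the bisector
    norm = math.hypot(bx, by)
    if norm < 1e-12:
        raise ValueError("arms are collinear; there is no angle to round")
    radius = t * math.tan(half)
    reach = t / math.cos(half)                 # apex-to-center distance
    center = (c[0] + reach * bx / norm, c[1] + reach * by / norm)
    tangent_b = (c[0] + t * ub[0], c[1] + t * ub[1])
    tangent_d = (c[0] + t * ud[0], c[1] + t * ud[1])
    return center, radius, tangent_b, tangent_d


def random_tangent_distance(shorter_arm, unit_length, rng):
    """Pick a numbered division (0 = apex) on the shorter arm at random."""
    n_divisions = int(shorter_arm // unit_length)
    return rng.randint(0, n_divisions) * unit_length
```

Choosing division zero gives t = 0 and therefore a zero radius, so the
angle is left as drawn. When two successive angles are curved, the arm
passed to `random_tangent_distance` for the second angle should be the
part of the shared side remaining beyond the first arc's tangent point, as
in the J’KL example above; the sketch leaves that bookkeeping to the
caller.
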

Method 5. Angular shapes can be 
changed into curved shapes by a pro- 
cess of photographic blurring. The 
figure is first photographed and then, 
with the help of an enlarger, is 
printed out-of-focus on high contrast 
paper. The resulting image has a con- 
tour which is curved, but which is 
also graded in density. A repetition 
of the process of photographing and 
printing, however, will eliminate the 
density gradient, producing a shape 
with contours which are rounded and 
well-defined. The amount of blur 
may, of course, be carefully con- 
trolled, and a graded series of curved 
shapes may be made from a single 
prototype shape. 


Open Contours


Method 6. There are many ways in 
which open-contour nonsense shapes 
may be constructed from a table of 
random numbers; but all that we 
have used have been variations on 
one basic method. Starting from the
approximate center of a matrix of 
convenient size, a line is drawn to one 
of the eight intersections nearest the 
starting point. These eight intersec- 
tions (or, more generally, directions) 
have been assigned numbers as shown 
in Fig. 4a. The intersection on the 
graph paper at which the first line 
terminates becomes the origin for 
the second line to be drawn, and so 
on. A difficulty with this method is 
that there is no intrinsic criterion for 
completeness in such a figure. One 
objective rule is to determine, before 
beginning the construction, the total 
number of digits to be selected from 
the table and to consider the figure 
complete when that number of lines 
has been drawn. 
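
The basic open-contour method amounts to a random walk on the matrix, and
may be sketched as follows. The numbering of the eight directions below is
hypothetical (the assignment shown in Fig. 4a is not reproduced here), and
a pseudo-random generator stands in for the table of random numbers. The
`allowed` and `max_length` parameters are illustrative names for
restricting the permitted directions and for letting the length of each
line vary randomly.

```python
import random

# Hypothetical numbering of the eight neighbouring intersections, clockwise
# from "up"; the numbering actually used in Fig. 4a may differ.
DIRECTIONS = {
    1: (0, 1), 2: (1, 1), 3: (1, 0), 4: (1, -1),
    5: (0, -1), 6: (-1, -1), 7: (-1, 0), 8: (-1, 1),
}


def open_contour(n_lines, rng, allowed=(1, 2, 3, 4, 5, 6, 7, 8), max_length=1):
    """Return the successive intersections visited by an open contour.

    The figure is considered complete when n_lines lines have been drawn,
    which is the stopping rule fixed before construction begins.
    """
    path = [(0, 0)]                              # start at the matrix center
    for _ in range(n_lines):
        dx, dy = DIRECTIONS[rng.choice(allowed)]
        length = rng.randint(1, max_length)      # 1 unless lengths are varied
        x, y = path[-1]
        path.append((x + dx * length, y + dy * length))
    return path
```

For example, `open_contour(20, random.Random(4))` draws twenty unit lines
in any of the eight directions, while
`open_contour(10, random.Random(4), allowed=(1, 3, 5, 7), max_length=3)`
restricts the contour to four directions and lets each line span one to
three matrix units. Nothing in the sketch prevents the contour from
retracing or crossing itself.
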

Many variations on this basic tech- 
FIG. 4. CONSTRUCTION OF OPEN-CONTOUR
“RANDOM” SHAPE: a. NUMBERING OF POSSIBLE
INTERSECTIONS; b. TYPICAL NONSENSE SHAPE


nique may be introduced. For ex- 
ample, for some purposes it may be 
desired to allow only four directions 
in which the contour may vary; also, 
the length of each line may be deter- 
mined randomly as well as the direc- 
tion. Partially or wholly curved con- 
tours may be produced by this 
method as follows: the radius of cur- 
vature of the arc drawn to connect 
successive intersections along the 
horizontal and vertical axes of the 
matrix is set as one-half the length 
of a matrix unit. To connect two 
intersections diagonally separated, 
the arc would have a radius equal to 
one matrix unit. Thus, for example, 
one might determine randomly for 
each line constructed: (a) which two 
intersections will be connected, (b)
whether the connection is to be linear
or curved, and (c) the direction of
curvature. Figure 4b was drawn by
this technique. Additional variations 
on these methods may be provided 
by using semi-log, log-log, or polar 
coordinate matrices on which to con- 
struct the nonsense contours. 


Patterns 


Method 7. Although the more obvious
ways of generating random patterns
have been used by a number of
investigators, the possibilities of this 
approach to the construction of complex
visual displays have never been
adequately explored. In general the 
practice has been to construct a 
matrix of some given size and then to 
determine randomly which cells are 
to be filled. Patterns of dots were 
constructed in this fashion by Kauf- 
mann, et al., (19), French (16), and 
Klemmer and Frick (20), for exam- 
ple. Attneave used the same ap- 
proach, including the introduction of 
a symmetry factor, in a study of the 
effect of redundancy on memory for 
patterns (4). In another slight varia- 
tion Arnoult used random shapes as 
elements in constructing random pat- 
terns for use in a learning experiment 
(2). Patterns generated in this fash- 
ion are very attractive as stimuli be- 
cause it is usually possible to compute
fairly precisely the informational
content of the display.
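
A sketch of the matrix-filling practice and of the accompanying
information computation is given below. It assumes that each cell is
filled independently with probability p, so that a display of N freely
determined cells carries N·H(p) bits, where H is the binary entropy; the
cited studies may have used other schemes (for example, a fixed number of
filled cells). The symmetry factor is represented by mirroring the left
half of the matrix. All names are hypothetical.

```python
import math
import random


def random_pattern(rows, cols, rng, p=0.5):
    """Matrix of 0/1 cells, each filled independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(cols)]
            for _ in range(rows)]


def symmetric_pattern(rows, cols, rng, p=0.5):
    """Pattern with left-right mirror symmetry; only half the cells are free."""
    half = random_pattern(rows, (cols + 1) // 2, rng, p)
    return [row + row[:cols // 2][::-1] for row in half]


def binary_entropy(p):
    """Information, in bits, carried by one cell filled with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def pattern_bits(n_free_cells, p=0.5):
    """Information in a display whose free cells are filled independently."""
    return n_free_cells * binary_entropy(p)
```

On these assumptions a 6 × 6 pattern with p = 0.5 carries 36 bits, and its
mirror-symmetric counterpart carries 18, since only half of the cells are
free to vary.
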


Systematic Variations 


Frequently it is desired to construct
“families” of shapes having
known physical relationships among 
the individual members. Again, there 
are many possible techniques for ac- 
complishing this end. The following 
two methods represent two kinds of 
systematic variations which have re- 
cently been used. 

Method 8. A prototype shape is 
constructed by any of the methods 
so far described. Then, each point 
is moved to a new location and the 
connecting lines redrawn as before. 
In moving the points, any of the fol- 
lowing parameters may either be 
held constant or varied randomly: 
(a) the number of points moved, (b)
the particular points moved in mak- 
ing successive variations on the same 
prototype, (c) the distance through 
which a point is moved, and (d) the 
direction of movement. A number of 
variations made from a given prototype
will form a distribution of shapes
which ‘‘vary about”’ the prototype. 
Stimuli of this sort were used re- 
cently by Attneave in testing the 
hypothesis that knowledge of the 
prototype shape, or ‘‘schema,’’ would 
facilitate discrimination of the varia- 
tions in paired-associate learning (6), 
and by Arnoult in a study of the ef- 
fect of predifferentiation training 
on recognition (1). A typical prototype
shape and its variations are shown in Fig. 5.

FIG. 5. A PROTOTYPE SHAPE AND “FAMILY”
OF RANDOM VARIATIONS
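Method 8 may be sketched as follows, under stated assumptions. Here the number of points moved and the distance of movement are held constant, while the particular points and the direction of movement are varied randomly; the function name, the five-point prototype, and the parameter values are hypothetical.

```python
import math
import random


def vary_prototype(points, n_moved, distance, rng):
    """Return one variation of a prototype: n_moved randomly chosen points
    are displaced by `distance` in a random direction; the remaining points
    and the order of connection are left unchanged."""
    moved = set(rng.sample(range(len(points)), n_moved))
    variation = []
    for i, (x, y) in enumerate(points):
        if i in moved:
            angle = rng.uniform(0.0, 2.0 * math.pi)
            x, y = x + distance * math.cos(angle), y + distance * math.sin(angle)
        variation.append((x, y))
    return variation


prototype = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]
rng = random.Random(1)
family = [vary_prototype(prototype, n_moved=3, distance=1.0, rng=rng)
          for _ in range(6)]
```

Allowing `n_moved` or `distance` to be drawn at random for each variation would give the other combinations of parameters listed in the text.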


Method 9. A somewhat different 
technique for creating ‘‘families” of 
shapes has been developed at Stan- 
ford by LaBerge and Lawrence (23). 
Initially, a random shape is con- 
structed by a method essentially 
the same as those described in 
Method 1 and Method 2 (actually, 
LaBerge and Lawrence simply con- 
nected randomly chosen points into 
the polygon of minimal perimeter). 
Then, each point on the contour is 
assigned randomly chosen “x” and
“y” increments to its coordinates,
and these new coordinates are plotted
and connected on a fresh matrix.
These same increments are then
added to the new coordinates and a
third figure is constructed. This pro- 
cess may be continued until one has 
constructed a row of, say, six figures, 
each differing from its immediate 
neighbors by a constant amount of 
distortion as measured by the dis- 
tance through which the points move. 
The next step is to label the former
“x” increments as “y's” and the
former “y” increments as “x's.”
These new increments are added to
the coordinates of the points of all six 
of the figures already constructed, 
and the process of constructing suc- 
cessive shapes is repeated until there 
is a column of six shapes for each of 
the original six shapes. The final re- 
sult is a matrix of 36 shapes in which 
any two adjacent shapes in a row or 
column are equally spaced in terms 
of the average distance the points 
have moved. Matrices of stimuli 
of this sort are currently being used 
by LaBerge and Lawrence in studies 
of transfer. 
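The construction of such a matrix can be sketched in a few lines, on the reading that the second set of increments is simply the first set with its x and y values interchanged. The starting points below are drawn at random and are not connected into a polygon of minimal perimeter; all names and numerical ranges are illustrative assumptions.

```python
import math
import random


def shape_matrix(base, increments, size=6):
    """matrix[k][j] is the base shape with the original increments (dx, dy)
    added j times and the interchanged increments (dy, dx) added k times."""
    matrix = []
    for k in range(size):
        row = []
        for j in range(size):
            row.append([(x + j * dx + k * dy, y + j * dy + k * dx)
                        for (x, y), (dx, dy) in zip(base, increments)])
        matrix.append(row)
    return matrix


def mean_move(shape_a, shape_b):
    """Average distance through which corresponding points have moved."""
    return sum(math.dist(p, q) for p, q in zip(shape_a, shape_b)) / len(shape_a)


rng = random.Random(2)
base = [(rng.randint(0, 50), rng.randint(0, 50)) for _ in range(8)]
increments = [(rng.randint(-3, 3), rng.randint(-3, 3)) for _ in range(8)]
m = shape_matrix(base, increments)
```

Because interchanging the x and y increments leaves the length of each displacement unchanged, adjacent shapes in a row and adjacent shapes in a column are separated by the same average point movement.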

As has been emphasized a number
of times in the preceding discussion,
these methods for constructing ‘‘ran- 
dom” shapes are only a few which 
have been selected to show some of 
the classes of shapes which can be 
constructed. The number of different
sets of rules which can be developed
for plotting and connecting
points taken from a table of random
numbers is limited only by the fertility
of the individual experimenter’s
imagination. It should be reiterated,
however, that using stimuli constructed
by these “random” methods
does not insure that the generalizations
resulting from the research will
be pertinent to all other kinds of
visual stimuli. It guarantees only
that the results will be generalizable
within a particular stimulus-domain,
i.e., to any other stimuli constructed
by the same rules.


ANALYSIS OF NATURAL FORMS


Let us now return to a problem 
which the methods discussed in the 
previous section by no means obvi- 
ate. We still need a technique, or a 
set of techniques, by means of which 
physical measurements of a psycho- 
logically relevant sort may be ob- 
tained for forms which we have not 
constructed ourselves. Any method 
of “random” construction must em- 
ploy some set of rules, either arbi- 
trary or otherwise, and these rules 
will strictly determine the class-characteristics,
or statistical parameters,
of the shapes constructed. We 
should like to be able to devise rules 
such that our synthetic shapes might 
possess the statistical characteristics 
(but not the familiarity) of natural 
shapes to which we wish to gen- 
eralize. At present, we lack not only 
a factual knowledge of the values of 
these statistical parameters, but also 
a methodology to guide us in their 
determination. Likewise, when some 
experimental variation of form is 
found to produce a certain effect in 
the laboratory, it is necessary that 
the variable in question be identifia- 
ble and measurable outside the labo- 
ratory if the results are to be gen- 
eralized. Unfortunately, however, it 
is much harder to measure form than 
to manipulate it. 

Relatively few scientists have seriously
applied themselves to the problems
of analyzing and describing
form; these problems seem to have
fallen into the cracks between sciences,
and no general quantitative
morphonomy has ever developed.
D’Arcy Thompson's Growth and Form 
(27) is virtually the only major work 
in the field: it is a fascinating and 
impressive book, but its contribution 
to the identification of psychophysi- 
cal variables is limited. Rashevsky, 
whose work in mathematical biophysics
is in some respects a continuation
of Thompson’s, has been more
directly concerned with psychologically
relevant measures of form.

Abstraction of contour. Considering
that the first step in the analysis of a 
shape is the abstraction of its con- 
tour, Rashevsky (25, p. 449) devised 
a simple hypothetical nerve-net with 
this function. Suppose that the stim- 
ulation of the retina is projected to 
some central area as an activity of 
sharply localized excitatory fibers 
and of inhibitory fibers slightly more 
diffuse in their projection. If certain 
constants of the system have proper 
values, excitation from any area of 
uniform brightness will be sup- 
pressed, except at a contour where 
such an area is bounded by a darker 
one which provides less inhibition. 
This nerve-net has a fairly close 
analogue in the following photo- 
graphic process. A negative and a 
positive transparency, separated by 
a thin plastic sheet, are precisely 
superimposed so that they “cancel”
each other when viewed from a right 
angle. A print is made by transmit- 
ting light from a diffuse source (e.g., 
the ground glass of a contact printer) 
through the superimposed positive 
and negative to a high-contrast paper 
placed in contact with the negative. 
In the case of a black object on a 
white ground, or vice versa, light 
can angle through both positive and 
negative only at the contour, and 
the resulting print is indistinguish- 
able from an outline drawing of the 
object. In the case of more complex 
pictures, the abstraction of sharp 
brightness-gradients preserves tex- 
ture, as well as contour: this is illus- 
trated clearly in Fig. 6. A picture ob- 
tained in this way may be thought of 
as a differential (with respect to 
brightness) of the original, involving 
a “delta” of finite magnitude. If
a smaller “delta” had been taken in
the derivation of Fig. 6 (by reducing
the space between the superimposed
positive and negative), the iris and
pupil of Thompson’s eye, for ex-
ample, would appear in outline in-
stead of as a black dot.

FIG. 6. A DIFFERENTIAL PICTURE
The photograph of D'Arcy Thompson from which this was derived is by Björn Soldan; it
appeared originally in Isis and was reproduced in the August, 1952, Scientific American. In the
original, the lightest portions are Thompson's forehead and beard, and the darkest portion is
the back of his coat. These have approximately equal brightness in the differential picture.
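The principle common to the nerve-net and its photographic analogue, sharply localized excitation opposed by more diffuse inhibition, can be illustrated with a small numerical sketch. The square neighborhood, the use of a simple average as the inhibitory term, and the clipping of negative values to zero are all assumptions made for illustration; neither the constants of Rashevsky's system nor the characteristics of the photographic materials are specified here.

```python
def abstract_contour(image, radius=1):
    """Local brightness minus the mean brightness of the surrounding
    (2*radius+1) x (2*radius+1) neighborhood, with negative values clipped
    to zero. Uniform areas cancel; the bright side of a contour remains."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [image[rr][cc]
                         for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                         for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            inhibition = sum(neighbors) / len(neighbors)
            out[r][c] = max(0.0, image[r][c] - inhibition)
    return out


# A bright 3 x 3 square on a dark 7 x 7 ground.
image = [[1 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
         for r in range(7)]
contour = abstract_contour(image)
```

In this example the center of the square and the dark ground both yield zero, while the cells along the edge of the square retain positive values. Increasing `radius` plays roughly the part of a larger “delta.”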

In 1948 one of the authors (Att- 
neave), in collaboration with John 
M. Stroud, attempted to develop 
this photographic technique to a de- 
gree of precision such that the total 
reflectance of the differential picture 
might serve as an index of the com- 
plexity of the original. That attempt 
was unsuccessful for several reasons, 
having to do chiefly with the unrelia- 
bility of photographic operations: 
e.g., the initial step of making a posi- 
tive and a negative which would ade- 
quately cancel always required con- 
siderable cut-and-try. It may be 
added that the process is a close rela- 
tive of one which has long been used 
to produce a ‘“‘bas-relief’’ effect, and 
that the Eastman Laboratories have 
recently employed a similar tech- 
nique with color film to obtain photo- 
graphs which look remarkably like 
paintings. 

An electronic device lately developed
by Kovasznay and Joseph
at the National Bureau of Standards
appears to accomplish much the same
result as the photographic process
described above, but in a manner subject
to more precise control. The
beam of a cathode ray tube, moving 
in a complex scan which covers the 
field in two orthogonal dimensions, 
transmits light through a photographic
transparency to a photoelectric
cell. The electrical signal thus
generated is differentiated and squared
electronically, and then fed into a re- 
ceiving scope where it modulates a 
beam synchronized with the trans- 
mitting beam. Illustrations of the 
results, which are presented in the
descriptive note of Kovasznay and
Joseph (21), could be mistaken for 
the efforts of a somewhat naive artist. 
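A discrete analogue of the differentiate-and-square operation is easily written down. In the sketch below the “complex scan” is replaced, as a simplifying assumption, by one pass along the rows and one along the columns of a sampled picture; the function names are hypothetical and the code is not a description of the actual circuit.

```python
def squared_derivative(scan):
    """Differentiate a sampled brightness signal and square the result."""
    return [(b - a) ** 2 for a, b in zip(scan, scan[1:])]


def outline(image):
    """Sum the squared derivatives obtained from horizontal and vertical scans."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):                       # horizontal scan lines
        for c, value in enumerate(squared_derivative(image[r])):
            out[r][c] += value
    for c in range(cols):                       # vertical scan lines
        column = [image[r][c] for r in range(rows)]
        for r, value in enumerate(squared_derivative(column)):
            out[r][c] += value
    return out


# Dark left half, bright right half: only the boundary should respond.
picture = [[0, 0, 1, 1] for _ in range(4)]
edges = outline(picture)
```

Squaring makes the response independent of the direction of the brightness change, so that a dark-to-light boundary and a light-to-dark boundary are rendered alike.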

A group of engineers in the Lin- 
coln Laboratory of M.I.T., including 
Oliver G. Selfridge, Gerald P. Din- 
neen, and Marshall Freimer, are cur- 
rently experimenting with the use of 
digital computers to perform opera- 
tions relevant to object identifica- 
tion. They have been successful in
programming a contour-abstracting 
operation; this is preceded by an av- 
eraging operation, which rids the fig- 
ure of irrelevant detail, and followed 
by an operation which abstracts 
angles, or regions of high curvature, 
from the contour (26). 

The mere abstraction of contour, 
whether by an objective process or 
with the aid of the experimenter’s 
own perceptual machinery, does not 
in itself constitute quantification. It 
does, however, contribute to the iso- 
lation of that which is to be quanti- 
fied: i.e., form. Whenever we speak 
of form, we are referring to a some- 
what vague set of properties which 
are invariant under transformations 
of color and brightness, size, place, 
and orientation; our definition may 
or may not be extended to specify 
invariance under projective (or per- 
spective) transformations. Contour 
is characterized by invariance under 
color and brightness transformations. 
Attneave (3) has previously pointed 
out the related (though not equiva- 
lent) fact that contours are regions 
of relatively high informational con- 
tent. 

Analysis of contour. There are vari- 
ous practical reasons for wishing to 
be able to describe a contour in terms 
which are independent of its size, 
place, and orientation. For example, 
subjects are often required to draw 
figures from memory: such drawings 
cannot be fairly evaluated by any
simple method of superimposing a
drawing upon the original and meas- 
uring deviations, because of differ- 
ences in scale, etc. If both the origi- 
nal and the reproduction could be 
represented in terms descriptive of 
form alone, they could then be com- 
pared objectively. 

Such a representation may take the 
form of a single function. If the re- 
ciprocal of the radius of curvature of 
a closed contour is plotted against 
distance along the contour, a peri- 
odic function results. This function 
may be normalized (i.e., rendered in- 
dependent of the scale of the original 
figure) by assigning a value of unity 
to the perimeter of the figure and ex- 
pressing radius of curvature in com- 
parable terms, or by setting equal to 
unity the area under one period of 
the function. An angle is represented 
by a vertical line which rises (or falls, 
in the case of a concave angle) to infinity;
a spike of this sort, of infinite
height, infinitesimal width, and determinate
area, is the so-called $\delta$-function
of Dirac, and is amenable to
mathematical treatment.
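For a polygon the curvature function is zero everywhere except at the vertices, so that it may be represented simply by the positions of the spikes and their areas. The sketch below takes the area of each spike to be the angle turned at the vertex, in radians, and normalizes distance by the perimeter; the sign convention (counterclockwise turns positive in ordinary x-y axes) and the function name are assumptions.

```python
import math


def curvature_spikes(vertices):
    """For a closed polygon, return (position, area) for each spike of the
    curvature function: position is distance along the contour as a fraction
    of the perimeter, area is the signed turning angle in radians."""
    n = len(vertices)
    edges = [(vertices[(i + 1) % n][0] - vertices[i][0],
              vertices[(i + 1) % n][1] - vertices[i][1]) for i in range(n)]
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    perimeter = sum(lengths)
    spikes, travelled = [], 0.0
    for i in range(n):
        travelled += lengths[i]                 # arrive at the end of side i
        (ax, ay), (bx, by) = edges[i], edges[(i + 1) % n]
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        spikes.append((travelled / perimeter, turn))
    return spikes


small_square = curvature_spikes([(0, 0), (2, 0), (2, 2), (0, 2)])
large_square = curvature_spikes([(5, 5), (15, 5), (15, 15), (5, 15)])
```

The two squares give identical descriptions, four equally spaced spikes each of area $\pi/2$, which illustrates the independence of the normalized function from size and place.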

If one feels more comfortable deal- 
ing with finite ordinates, the follow- 
ing system may be used. Imagine a 
miniature tricycle, guided over a 
course such that a point midway be- 
tween the rear wheels precisely fol- 
lows the contour. The angle $\theta$ by
which the front wheel deviates from
a forward position may be plotted
against distance travelled by the
front wheel to give a periodic func-
tion descriptive of the contour. The
front wheel will move in an arc con-
centric with the segment of the con-
tour being followed. Wherever an
angle occurs in the contour, the angle
$\theta$ of the front wheel will be 90°; thus
the function will always have some
value between plus and minus 90°.
Radius of curvature, $r$, is related to
$\theta$ by the equation $r = L \cot \theta$, in
which $L$ is the distance between the
front and rear wheels. Normalizing
may be accomplished by giving the
perimeter of the figure unit value, and
setting $L$ at some standard fractional
value. If $L$ is made to equal $1/2\pi$,
regular polygons will be represented
by square waves regularly alternat-
ing between 0 and 90°, a circle will
become a horizontal line with an
ordinate of 45°, and certain other
regularities will be uniquely repre-
sented; this value of $L$ is somewhat
large for convenient use with more
complex shapes, however. The in-
terested reader will have little diffi-
culty in working out further details
of the system. It has the advantage
of specifying an actual measuring de-
vice which is practical and simple to
construct. Automatic recording of
the function could be arranged with
two pairs of selsyns: one translating
the rotation of the front wheel into a
movement of the recording paper;
the other coupling the angular posi-
tion of the front wheel with the posi-
tion of a recording pen.

3 This system of representation has been de-
veloped in considerable detail by Oliver
Strauss (personal communication).
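The relation between front-wheel angle and radius of curvature can be checked numerically. The sketch below solves $r = L \cot \theta$ for $\theta$, that is, $\theta = \arctan(L/r)$, and evaluates it for the three cases mentioned in the text; the function and variable names are illustrative.

```python
import math


def front_wheel_angle(radius, wheelbase):
    """Angle of the front wheel, in degrees, from r = L cot(theta).
    atan2 is used so that r = 0 (an angle in the contour) gives 90 degrees."""
    return math.degrees(math.atan2(wheelbase, radius))


wheelbase = 1 / (2 * math.pi)        # L = 1/(2*pi), with perimeter taken as unity
circle_radius = 1 / (2 * math.pi)    # radius of a circle of unit perimeter

on_circle = front_wheel_angle(circle_radius, wheelbase)
at_angle = front_wheel_angle(0.0, wheelbase)
on_straight = front_wheel_angle(math.inf, wheelbase)
```

With this value of $L$ the circle of unit perimeter gives 45°, an angle in the contour gives 90°, and a straight segment gives 0°, in agreement with the square-wave representation of regular polygons.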

Both of the functions just de- 
scribed have a serious disadvantage. 
Suppose we wish to compare two 
shapes which have a part-to-part or 
part-to-whole similarity—say, the 
outline of a cow’s head with the out- 
line of a whole cow. The normalizing 
factors which will be employed on a 
basis of perimeter or area will obvi- 
ously not be such as to give compara- 
ble representation to the similar por- 
tions of the outlines. 

The method next to be described 
avoids this difficulty, though it is not 
without limitations of its own. In- 
stead of describing the contour by
means of a continuous function, we
may attempt to analyze it into parts
which are individually homogeneous,
and hence amenable to approximate
description in terms of a few stand-
ardized dimensions. It is usually
possible to construct a polygon about
a figure made up of complex lines
and curves, as in Fig. 7, by drawing
tangents (a) at points of zero curva-
ture (e.g., CD, IJ, etc.; whenever a
curve changes from concave to con-
vex, it must have an intermediate
point of zero curvature), (b) at points
of minimal curvature, where a de-
crease in curvature is followed by
an increase (e.g., FG), and (c) at dis-
continuities of slope, or angles (e.g.,
AB, GH, etc.). The series of lines
thus formed may be described simply
by stating the slope and length of
each line in succession, but this de-
scription is peculiar to a given orien-
tation and size of the figure. It may
be rendered orientation-free and scale-
free by specifying instead, for each
pair of adjacent segments, (a) the
change in direction (in degrees), and
(b) the change in the logarithm of
length, as the contour is followed in a
clockwise direction.⁴ Curves are
treated as “rounded-off” angles: i.e.,
a curve is approximated by an arc
located tangent to two successive
lines of the polygon we have been
discussing. In most cases, the size of
the arc will be limited by the length
of the shorter of the two segments.
Hence curvature is conveniently ex-
pressed by a third coordinate speci-
fying (c) the proportion of the distance
between the apex of the angle and the
end of the shorter segment at which the
arc best approximating the curve is
tangent. This coordinate will usually
have some value between 0 and 1.0,
with 0 indicating an abrupt angle
(radius of curvature equal to zero)
and 1.0 indicating an arc which is
tangent to the shorter segment
at its end. In the case of Fig. 7,
for example, (c) would have a
value of 0 at A and M, a value of
1.0 at G, and a value of about .8 at C
(note that the arc best approximat-
ing a curve will not necessarily have
the same point of tangency as the
original curve). When the arc ap-
proximating a curve turns through
more than 180°, as in the case of the
bulbous projection in the JKL re-
gion, the value of (c) will not remain
between 0 and 1, since some of the
points of tangency are on extensions
of segments of the polygon, rather
than on the segments themselves.
The values of (c) associated with J,
K, and L would be about 3.8, 3.6,
and .5, respectively.⁵

FIG. 7. ILLUSTRATION OF METHOD FOR
QUANTIZING IRREGULAR CONTOUR

4 Several other possible pairs of coordinates
convey the same information. What is re-
quired, essentially, is to describe the shapes
of successive segments of the polygon, taken
in pairs. Measures of any two angles of such
a triangle, or any two ratios of sides or differ-
ences between logarithms of sides, or any
combination of an angle and a comparison of
sides, is adequate to specify the shape of the
triangle. The combination above is chosen
for its intuitive appeal; also because errors of
measurement have a more uniform effect on
these coordinates than on certain others.

5 The two sets of numbers below, which are
presented as a demonstration of the practica-
bility of the system and as an amusement for
the reader, describe recognizable profiles of
the two authors. Successive tri-coordinates
are given, thus: a1, b1, c1; a2, b2, c2; etc. In the
actual reconstruction of a contour from such
coordinates, a line of any desired length and
slope is drawn to start. The first triad of
coordinates gives the relationship of the
second line of the contour to the arbitrarily
drawn starting line, and so on. Cumulative
error will be avoided if the values of a are
cumulatively added to the slope, in degrees, of
the starting line (with due regard for the circu-
larity of the scale) to obtain the slope of each
segment; likewise if the values of b are cumu-
latively added to the log10 length of the start-
ing line to obtain the log10 length of each seg-
ment. Positive values of a denote clockwise
turns; negative values counterclockwise (con-
sistent reversal results in a mirror-image). In
specifying values of (c), the symbol “<” is
used to mean “less than .1,” i.e., that the
angle is rounded to a slight, practically un-
measurable, degree.

+61, —.56, .3; —84, +.08, <; +27, —.04, 
1; +35, +.66, .2; +155, —.06, 1.0; —145, 
—.39, 0; +32, +.13, 1.0; +27, —.43, <; 
—56, +.61, .1; +107, —.38, .4; —69, —.12, 
1; +37, —.62, <; —88, +.38, 0; +105, 
+.11, 1.0; —79, +.33, .1; +86, +.34, .5; 

; —25, —.76, .4; —66, +.23, 

, 1.0; +30, +.09, 0; +113, 
5 25, A 
, 4; +26, — 

, 4;+36, -. 6; +41, —.41, 
. 45,0; —6, —.18, <; +29, +.03, 
—43, +.26, <; +12, —.18, <; +88, 

‘ 5; +23, —.27, .4; —124, +.31, .5; 
+90, —.12,-.8; —29, —.28, <; +70, —.23, 
1.0; —114, +.39, 0; +66, +.16, .3; —65, 
—.04, .4; +44, —.17, <; —43, +.77, .7; 
+121, —.28, .9; —26, +.04, <<; +105, —.08, 
.7; —62, —.49, .6; +26, +.09, <; —113, 
+.13, .3; —25, +.25, .5; +44, —.26, .8; —83, 
—.24, 0; +19, +.42, 1.0. 


.22, .4; —31, +.02, 
4 ae

The reader will recognize this sys-
tem of analysis as essentially the re-
versal of a method for constructing
“random” shapes which was de-
scribed in the previous section. The
system has several advantages: (a) 
It yields a description which does 
not vary with size and orientation. 
(b) Since the use of a general normal- 
izing factor is avoided, part-similari- 
ties between the contours of two ob- 
jects are reflected in their numerical 
descriptions. Likewise, repetitious 
sequences of elements in the same 
contour (but not parallel lines) are 
reflected, and could be quantified by 
an autocorrelational technique. If 
two similar shapes (e.g., an original 
and a subject’s reproduction from 
memory) were compared by cross 
correlation of their numerical de- 
scriptions, it would be desirable to 
calculate values for several ‘‘displace- 
ments”’ of one set of coordinates upon 
the other (as in autocorrelation), in 
order to allow for qualitative omis- 
sions or additions of elements. (c) 
There is reason to believe that the 
number of tricoordinates required to 
describe a shape constitutes a first 
order approximation of its psycho- 
logical complexity (i.e., the number 
of psychologically discrete parts which 
it contains). Fehrer (14) used a simi- 
lar measure (number of internally 
homogeneous lines) on her figures, 
and found that complexity, so meas- 
ured, was closely related to difficulty 
in a reproduction-learning situation. 
Attneave (5) recently confirmed that 
the number of sides in a polygon is 
the primary determinant of its judged 
complexity. A better approximation 
would require some adjustment for 
repetitious sequences of elements, 
mentioned above (see Rashevsky, 
25, p. 486 ff.; also Attneave, 3, 4). 
The major disadvantage of the sys- 
tem is that some figures (spirals are 
an obvious case) do not yield unique 
descriptions. This limitation arises 
inevitably from the approximation 
of all curves with straight lines and
arcs, and the ignoring of higher-order
invariances. It is interesting to specu-
late that the system might be to some
degree psychomimetic even in this
limitation, and that objects for which
it does not yield unique descriptions
are less likely to evoke reliable per-
ceptual responses, with the result that
they may be perceived as “amorphous”
or “unstable,” and be difficult
to remember.
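The first two coordinates of the system, change in direction and change in the logarithm of length for each successive pair of segments, may be computed for any polygon as in the following sketch. The curvature coordinate (c) is omitted, since it is zero for a figure composed entirely of abrupt angles. The function name, the use of ordinary x-y axes with clockwise turns counted as positive, and the example quadrilateral are assumptions made for illustration.

```python
import math


def describe(vertices):
    """For each successive pair of sides of a closed polygon, return
    (a, b): a is the change in direction in degrees (clockwise positive),
    b is the change in log10 length."""
    n = len(vertices)
    sides = [(vertices[(i + 1) % n][0] - vertices[i][0],
              vertices[(i + 1) % n][1] - vertices[i][1]) for i in range(n)]
    description = []
    for i in range(n):
        (ax, ay), (bx, by) = sides[i], sides[(i + 1) % n]
        turn_ccw = math.degrees(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
        ratio = math.hypot(bx, by) / math.hypot(ax, ay)
        description.append((-turn_ccw, math.log10(ratio)))
    return description


def transform(vertices, angle, scale, shift):
    """Rotate, enlarge, and displace a figure (used only to test invariance)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(scale * (x * cos_a - y * sin_a) + shift[0],
             scale * (x * sin_a + y * cos_a) + shift[1]) for x, y in vertices]


shape = [(0, 0), (4, 0), (5, 3), (1, 4)]
original = describe(shape)
altered = describe(transform(shape, angle=0.7, scale=3.0, shift=(10, -2)))
```

The two descriptions agree to within rounding error, and for a closed figure the values of b sum to zero, which provides a convenient check on measurement.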
Measuring operations like the fore- 
going, which involve following about 
a contour, are laborious to accom- 
plish manually. It appears, however, 
that they are quite amenable to 
automation by electronic and me- 
chanical means. For example, an 
electronic contour-follower, described 
by Beurle (7), has already been con- 
structed. A point of light is moved 
rapidly through a very small circle; 
when its path crosses the contour, a 
signal is obtained. The phase rela- 
tionship of this signal to the circular 
movement is used to guide the circle 
along the contour, i.e., to move the 
point of light about the contour in a 
cycloidal path. A record of the move- 
ment of the circle, taken from the 
servo control loop, constitutes a de- 
scription of the shape which may 
further be transformed and analyzed
by computer-type circuits. 
Measurement of  gestalt-variables. 
We have been considering analytical 
systems by means of which the for- 
mal properties of contours may be 
described in detail. Also of interest 
is another set of variables which do 
not provide a description from which 
the shape can be reconstructed, but 
which do abstract important prop- 
erties of the shape as a whole. We 
shall refer to these as “gestalt-variables,”
or “gestalt-measures,” even
when they serve to summarize some
quantizing or analytical process: e.g.,
the number of sides in a polygon is
such a variable; so is the number of
tri-coordinates necessary to describe 
a shape by the system discussed 
above. Likewise, the mean value of 
the c-coordinate in that system might
be taken as a crude measure of over- 
all curvedness-vs.-angularity. It 
should be clear that the “statistical 
parameters” of populations of shapes, 
referred to earlier, necessarily pertain 
to distributions of certain gestalt- 
measures. 

The more restricted notion of a 
gestalt as a system in which every 
part is affected by every other part 
has been incorporated by Rashevsky 
(25, p. 451 ff.) into a hypothetical 
nerve-net. Suppose that the contour 
of an object is projected to some sheet 
of neurons in the cortex as an iso-
morphic excitation (Rashevsky’s 
mechanism for contour-abstraction 
has already been described). Sup- 
pose further a distribution of inhibi- 
tory fibers such that, in the next 
higher projection area, every point 
on the contour (i.e., every excited 
neuron) receives inhibition from ev- 
ery other point in an amount which 
varies as a function (presumably de- 
creasing) of the distance between the 
points. At this level, the various 
neurons to which the contour is pro- 
jected will retain more or less residual 
excitation, depending upon the de- 
gree to which each is isolated from 
the others. A given contour will be 
characterized (though not uniquely) 
by some distribution of residual exci- 
tations which will be invariant with 
respect to its place and orientation 
in the field (but not with respect to 
its size). The integral, or mean, of 
this distribution would constitute a 
measure of the “simplicity” or com-
pactness of the figure (provided size 
were held constant, or corrected for); 
e.g., a circle would have the highest 
possible value, since its points are as
far from one another as is possible in
a closed contour, and jagged or sinu- 
ous shapes would have low values 
(see Householder, 18). The neuro- 
logical terms in which this model is 
presented need not be taken too seri- 
ously; Rashevsky’s basic idea might 
equally well be applied to the pro- 
gramming of a man-made computer, 
or to a series of photographic opera- 
tions. 

Deutsch (12) has recently sug- 
gested a model for shape perception 
which is somewhat akin to Rashev- 
sky’s. Since it may be described very 
simply in terms of geometrical con- 
cepts, we shall ignore the neural 
mechanisms which Deutsch proposes 
as its basis. Suppose that a perpen- 
dicular is drawn to a closed contour 
at every point along its length. Each 
such perpendicular will contain a 
segment which lies inside, and is 
bounded by, the contour. The 
lengths of these segments will have 
some distribution which will depend 
upon the shape of the contour; this 
distribution may be rendered size- 
invariant by expressing the length 
of each segment as a proportion of 
the length of the contour. In the case 
of a circle, a square, or any other 
regular polygon with an even number 
of sides, the distribution will consist 
of a single spike, since all the seg- 
ments will be of equal length. 
Deutsch suggests that the most prim- 
itive mechanism of form-discrimina- 
tion may abstract a distribution of 
this sort; at the human or primate 
level it would obviously need supple- 
menting with some finer mechanism, 
perhaps one involving contour-fol- 
lowing. He points out that rats have 
more difficulty discriminating a 
square from a circle than from a tri- 
angle, and predicts further that regu- 
lar polygons with even numbers of 
sides should be more difficult to discriminate
from one another than
from odd-sided polygons. 
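The geometrical part of Deutsch's model can be sketched for the simple case of a convex polygon. From sample points along each side a perpendicular is drawn inward until it meets the contour again, and its length is expressed as a proportion of the perimeter. The restriction to convex polygons listed in counterclockwise order, the number of sample points, and the function name are simplifying assumptions.

```python
import math


def chord_lengths(vertices, samples_per_side=5):
    """Lengths of inward perpendiculars to a convex polygon (vertices in
    counterclockwise order), as proportions of the perimeter."""
    n = len(vertices)
    perimeter = sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))
    lengths = []
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        side = math.hypot(ex, ey)
        nx, ny = -ey / side, ex / side          # inward unit normal
        for s in range(1, samples_per_side + 1):
            t = s / (samples_per_side + 1)
            px, py = x1 + t * ex, y1 + t * ey
            best = math.inf
            for j in range(n):
                if j == i:
                    continue
                (x3, y3), (x4, y4) = vertices[j], vertices[(j + 1) % n]
                fx, fy = x4 - x3, y4 - y3
                denom = nx * fy - ny * fx
                if abs(denom) < 1e-12:
                    continue                    # side parallel to the perpendicular
                # Solve p + d*n = v3 + u*f for distance d and side parameter u.
                d = ((x3 - px) * fy - (y3 - py) * fx) / denom
                u = ((x3 - px) * ny - (y3 - py) * nx) / denom
                if d > 1e-9 and -1e-9 <= u <= 1 + 1e-9:
                    best = min(best, d)
            lengths.append(best / perimeter)
    return lengths


square = chord_lengths([(0, 0), (1, 0), (1, 1), (0, 1)])
hexagon = chord_lengths([(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3))
                         for k in range(6)])
triangle = chord_lengths([(0, 0), (1, 0), (0.5, math.sqrt(3) / 2)])
```

For the square and the regular hexagon every segment has the same length, so that the distribution reduces to a single spike; for the equilateral triangle the lengths are spread over a range of values.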

Merely to order shapes along a 
compactness-dispersion continuum 
requires nothing so elaborate as the 
Rashevsky model outlined above. 
The relationship of the perimeter of a 
shape to its area provides an attrac- 
tively simple means of measuring 
this characteristic. The quotient 
P/A, which has been employed by 
some investigators (8, 17), is unsatis- 
factory from our standpoint because 
it varies with size as well as with 
shape, but either $P^2/A$ or $P/\sqrt{A}$ is
size-invariant. These ratios may be
transformed in various ways to suit
the user’s convenience; e.g., the measure

$$D = 1 - \frac{2\sqrt{\pi A}}{P}$$


expresses dispersion as some number 
between zero and one, assigning zero 
value to the most compact figure pos- 
sible, the circle. Dispersion (as meas- 
ured by any such relationship of 
perimeter to area) is not the same as 
complexity (in the sense of number of 
parts). Although a deeply convoluted 
or jagged figure will indeed tend to
have a high dispersion value, so will 
a very thin rectangle or ellipse. 
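A few values may make the behavior of the dispersion measure concrete. The sketch below assumes that the measure is $D = 1 - 2\sqrt{\pi A}/P$, which is zero for a circle; the function name and the example figures are illustrative.

```python
import math


def dispersion(perimeter, area):
    """D = 1 - 2*sqrt(pi*A)/P: zero for a circle, approaching one for
    highly dispersed figures."""
    return 1.0 - 2.0 * math.sqrt(math.pi * area) / perimeter


circle = dispersion(2 * math.pi * 3.0, math.pi * 3.0 ** 2)   # radius 3
square = dispersion(4.0, 1.0)                                # side 1
large_square = dispersion(8.0, 4.0)                          # side 2
thin_rectangle = dispersion(2 * (100.0 + 1.0), 100.0)        # 100 by 1
```

The square has the same value whatever its size (about .11), whereas the thin rectangle, although it has no more parts than the square, has a value above .8.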
Bitterman, Krauskopf, and Hoch- 
berg (8, 22) have found that under 
conditions of low illumination or 
short exposure, shapes are perceived 
in much the same way as if they were 
physically diffused, or blurred. These 
experimenters created a physical dif- 
fusion model by cutting filter paper 
into various shapes and impregnat- 
ing it with an inhibitor of bacterial 
growth. This inhibitor was then al- 
lowed to diffuse from the paper into 
bacterial cultures. The shapes which 
most resembled each other after dif- 
fusion were those most often confused
under adverse viewing conditions.
Likewise, identification of impover- 
ished stimuli was most impaired in 
the case of shapes characterized by 
relatively small detail, which would 
be averaged out in a diffusion process. 

These findings are interesting and 
important, but the clumsy and some- 
what bizarre bacterial model does 
not lend itself to quantitative predic- 
tion. There is no apparent reason 
why it might not be replaced with a 
model employing optical blur, in 
which case diffusion would be meas- 
ured by the radius of the blur circle. 
An image may readily be blurred to 
a measurable degree in an ordinary 
photographic enlarger, and then re- 
sharpened by means of high-contrast 
paper or film (cf. Method 5 under
“The Construction of Stimuli”). This
resharpening process introduces an- 
other parameter, that of the black- 
white threshold to be used in print- 
ing. It is easiest photographically 
simply to employ long exposure and 
development, with the result that a 
white-on-black figure will diffuse 
outward into the field to the full ex- 
tent of the radius of the blur circle. 
If it is desired that concavities and 
convexities be affected symmetri- 
cally, however (note that a psycho- 
logical question requiring an empiri- 
cal answer is thus raised), it is neces- 
sary to resharpen the image into 
black and white about some inter- 
mediate gray such that a linear con- 
tour between black and white fields 


will be restored to its original posi- 


tion. This may be accomplished 
with the aid of a suitable test-figure. 


* Dinneen (13) has succeeded in program- 
ming a digital computer to perform averaging- 
and-resharpening operations of almost ex- 
actly this sort. His paper, which contains 
copious illustrations of the effect of varying 
resharpening threshold, is recommended to 
the reader who finds the above discussion in- 
sufficiently informative.


Over a wide range of values on the
resharpening threshold parameter, 
the process of blurring and resharpen- 
ing will decrease the dispersion 
($P^2/A$, or D) of any shape except a
circle, which is already the most com- 
pact shape possible. For any such 
value, dispersion will tend to decrease 
as amount of blur increases, but the 
form of this function—which we shall 
call a blur-response function—will 
vary with the shape involved and 
will describe certain important char- 
acteristics of the shape. Since the 
decrease in the function is associated 
with the ‘‘washing out”’ of progres- 
sively larger detail as the blur circle 
increases in size, any sharp drop indi- 
cates that the shape contains con- 
siderable detail of a magnitude indi- 
cated by the blur circle at that point. 
The blur-response function (or, per- 
haps better, its derivative) is thus a 
potential aid in the statistical evaluation
of “magnitude of critical detail,”
which Bitterman, et al., found to be
of primary importance in determin- 
ing the identifiability of an impover- 
ished shape (8). A full exploration of 
the properties of such functions (par- 
ticularly in the case of shapes char- 
acterized by certain types of regu- 
larity, or redundancy) is beyond the 
scope of this paper; our purpose here 
is merely to suggest their feasibility 
and possible usefulness. One further 
point should be made, however; 
neither the blur-response function 
nor any other gestalt measure can 
possibly predict the relative identifi- 
ability of shapes except in a limited, 
statistical way. The kinds and de- 
grees of similarity which an impov- 
erished shape bears to all the other 
shapes with which it might be con- 
fused will clearly affect the difficulty 
with which it is identified (quite 
apart from any intrinsic properties 
it may have), and these similarities
may be evaluated, if at all, only by
recourse to analytical measures. A 
particular detail in a shape may or 
may not be critical to identification, 
depending upon the specific discrimi- 
nations which identification requires. 
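
The dispersion measure referred to above can be illustrated with a minimal sketch. It assumes that dispersion means squared perimeter divided by area and that a shape is supplied as an ordered list of polygon vertices; the function name and the example shapes are hypothetical and do not come from the paper.

```python
import math

def dispersion(vertices):
    """Return perimeter**2 / area for a simple polygon.

    vertices: list of (x, y) pairs in order around the outline
    (an assumed representation, not one specified in the text).
    """
    n = len(vertices)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace term
    area = abs(twice_area) / 2.0
    return perimeter ** 2 / area

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
thin_rectangle = [(0, 0), (4, 0), (4, 1), (0, 1)]
circle_limit = 4 * math.pi  # perimeter**2 / area for any circle
```

On this reading a circle gives the smallest possible value, so any blurring operation that rounds off a polygon can only move its dispersion toward `circle_limit`.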

Gestalt measures, as defined earlier, 
all involve a reduction in the dimen- 
sionality of figures (sometimes, though 
not necessarily, to a single dimension) 
with a concomitant discarding of in- 
formation. The number of operations 
by means of which a shape may be 
“collapsed” to lower dimensionality
is indefinitely large, as Selfridge (26) 
has recently pointed out. At the 
simplest level, for example, we may 
literally collapse a shape upon any 
spatial axis by plotting, as a function 
of distance along that axis, the thick- 
ness of the shape in the orthogonal 
dimension (26, Fig. 3). The axis involved
need not even be linear; e.g.,
it might be a circle about the center
of gravity of the shape (cf. Pitts and
McCulloch, 24).

Of all the conceivable physical 
measures of shape, analytical as well 
as gestalt, there are undoubtedly 
many that have little or no value 
from a psychophysical point of view. 
On the other hand, it appears un- 
likely that any single system of physi- 
cal measurement can be optimal for 
all psychophysical situations: in other 
words, we are suggesting that form 
perception involves a number of dif- 
ferent psychological mechanisms 
which function in a complementary, 
and to some degree overlapping, man- 
ner. Unfortunately, there is no quick 
and easy way to determine which 
physical measurements have greatest 
psychological relevance; only experi- 
mentation can answer this question. 
The preceding discussion and review 
may at least serve, however, to alleviate
somewhat the paucity of hypotheses
which in the past has characterized
this research area.


THE NORMAL CURVE AND THE ATTENUATION
PARADOX IN TEST THEORY


LLOYD G. HUMPHREYS¹
University of Illinois


The appearance of Loevinger’s 
paper (2) on the attenuation paradox 
in test theory was the precipitating 
factor in the writing of this note.² In
reacting to her development of the 
paradox (supposed lack of monotonic 
relationship between reliability and 
validity) certain biases concerning 
test theory and test statistics which 
the writer has held for several years 
were crystallized. 

Bias Number One. Let's forget our 
fixation on the normal curve in test 
theory. 

Bias Number Two. Let’s use statistics
appropriate for rank-order,
point distributions.

In support of these biases the following
two arguments are offered:

1. Test score distributions are
rank-order, point distributions. The
underlying trait may or may not be
continuously and normally distributed,
but such speculation is of no
import. Psychological tests furnish
rank-order information only and,
furthermore, we have few prospects
of obtaining devices of any other
type. Criteria, on the other hand,

1 Visiting professor, University of Illinois, 
fall semester, 1955; on leave from Personnel 
Research Laboratory, Air Force Personnel and 
Training Research Center. This article is 
based in part on work done under ARDC 
Project No. 7702 in support of the research 
and development program of the Air Force 
Personnel and Training Research Center, 
Lackland Air Force Base, Texas. Permission 
is granted for reproduction, translation, pub- 
lication, use, and disposal in whole or in part 
by or for the United States Government. 

² The writer is indebted to Drs. Robert
Travers, John Leiman, and John Schmid, 
Jr., for critical reading of this manuscript. 




may occasionally be continuously 
distributed and certain of these dis- 
tributions may be normal, but cri- 
teria also are more frequently in the 
form of tests, ratings, rankings, pass- 
fail, and other point distributions. 

2. If no assumption is made con- 
cerning the shape of the criterion 
distribution in the work of Loevinger 
(2), Brogden (1), and Tucker (4), 
there is no paradox. For example, if 
all items in a test have difficulty val- 
ues of .5 and if all intercorrelations 
of items are equal, the relationships 
contained in Table 1 between number
of items, item reliability, or level
of interitem correlations, and validity
of total scores are obtained. It is
seen that the relationship between
reliability and validity is monotonic.


TABLE 1

VALIDITY AS A MONOTONIC FUNCTION
OF RELIABILITY

Item Inter-r    Test Validity
                Items    18 Items    45 Items    90 Items

.83 .92 .96
.85 .91 .96 .98
.90 a .98 .99
.94 : .99 .993
.96 : .991 .996
.97 ¥ .994 .997
.98 ‘ .997 .998
.993 .999 .999

Discussion. In obtaining the above 
results the same assumption about 
item validity made by preceding 
writers was used, i.e., each item except
for errors of measurement is a
true measure of the criterion. This
means that the validity of an item is
the square root of its reliability. In 
the present case reliability is indi- 
cated by the phi coefficient between 
items in the test, and the validity is 
a point biserial between the item and 
“true” score. The values in the table
are those obtained by applying the 
usual formula for the correlation of 
sums. Please note that here and else- 
where, when the term ‘correlation’ 
is used, a product-moment correla- 
tion is assumed. 
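
The “usual formula for the correlation of sums” can be written out as a minimal sketch under the assumptions stated in the text: every item correlates r with every other item and the square root of r with the true score. The function name is hypothetical, and no attempt is made here to reproduce the particular entries of Table 1.

```python
import math

def sum_validity(n_items, inter_r):
    """Correlation between an n-item total score and the true score.

    Each item is assumed to correlate sqrt(inter_r) with the true score
    and inter_r with every other item (product-moment correlations).
    """
    numerator = n_items * math.sqrt(inter_r)
    denominator = math.sqrt(n_items + n_items * (n_items - 1) * inter_r)
    return numerator / denominator

# Validity rises with both test length and item intercorrelation.
for r in (0.1, 0.3, 0.5):
    print(r, [round(sum_validity(n, r), 3) for n in (18, 45, 90)])
```

Because the expression equals the square root of the Spearman-Brown reliability of the total score, it cannot decrease when either the number of items or the item intercorrelation increases, which is the monotonic relationship at issue.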

The reader may have difficulty vis- 
ualizing the shapes of these criterion 
distributions since the definition of 
true is tied to the concept of infinity. 
The amount of error isn’t great in 
any derivation, however, if one substitutes
a large number for infinity.³
One thousand items will give results
reasonably comparable to infinity
(ten thousand would be eminently
safe), and the shape of the distribution
can actually be worked out. Suffice
to say, however, that a criterion
distribution, as defined for Table 1,
will not be normal when all items
have difficulty values of .5 unless
item intercorrelations are zero. For
the same item difficulty specifications
the distribution becomes rectangular
when item correlations reach
1/3 and becomes increasingly U shaped
as item correlations increase from 1/3
to 1.00.

The importance of the assumption
concerning normality of criterion distribution
is made clear in Table 2.


³ The mathematician, H. T. Davis, made
this suggestion in principle in a class at 
Indiana University in 1935-36. He stated 
that if mathematicians substituted a ‘‘very 
large number” for infinity in their calculations 
they would not obtain significantly different 
answers and their assumptions would have 
operational meaning. This suggestion seems 
peculiarly appropriate for test theory. For 
the latter theory the number does not need 
to be nearly as large as Dr. Davis envisioned.


TABLE 2

COMPARISON OF ITEM VALIDITIES WITH AND
WITHOUT THE ASSUMPTION OF CRITERION
DISTRIBUTION NORMALITY

Item Reliabilities    Item Validities                          Comparison Values
r_tet    r_phi        √r_tet, or r_bis    √r_phi, or r_pbis    r_pbis computed from r_bis

.10      .063         .316                .251                 .251
.20      .128         .447                .358                 .358
.30      .194         .548                .440                 .438
.40      .262         .632                .512                 .506
.50      .333         .707                .577                 .566
.60      .410         .775                .640                 .620
.70      .493         .837                .702                 .669
.80      .590         .894                .768                 .716
.90      .713         .949


This table was constructed by first
assuming certain item reliabilities
stated in terms of the tetrachoric correlation.
These values are in Column
1. Column 2 contains the corresponding
phi coefficients for the same items.
Column 3 contains the item validities,
stated as continuous biserials,
when the criterion is assumed to be
a true, normally distributed measure
of the function measured by the
item. Values in Column 3 are comparable
to item validities used previously
by Brogden and Loevinger.
Column 4 also contains item validities,
stated as point biserials, but the
criterion is assumed to be the sum of
an infinite number of items, of a
given level of reliability, whose distribution
takes the shape dictated
by their intercorrelations. Column
5 also contains point biserials, but
these were computed from the continuous
biserials in Column 3, which
were based on an assumed normal
distribution, by multiplying each by
the expression z/√(pq). Comparison
of Columns 4 and 5 shows how the
assumption of normality attenuates 
item validities with the error becom- 
ing progressively larger with higher 
validities. 
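
The construction of Table 2 can be retraced with a minimal sketch. It rests on one assumption not stated in the article: that for items of difficulty .5 the phi coefficient corresponding to a tetrachoric correlation is (2/π) arcsin of that correlation. The function name is hypothetical, and third-decimal differences from the printed table are to be expected.

```python
import math

# Ordinate of the unit normal curve at the median (p = q = .5).
Z_AT_MEDIAN = 1.0 / math.sqrt(2.0 * math.pi)

def table2_row(r_tet):
    """Columns 2 to 5 for an item reliability given as a tetrachoric r.

    Assumes item difficulty .5, so that phi = (2/pi) * asin(r_tet);
    this conversion is an assumption of the sketch, not a quotation.
    """
    r_phi = (2.0 / math.pi) * math.asin(r_tet)       # Column 2
    r_bis = math.sqrt(r_tet)                         # Column 3
    r_pbis = math.sqrt(r_phi)                        # Column 4
    pbis_from_bis = r_bis * Z_AT_MEDIAN / math.sqrt(0.5 * 0.5)  # Column 5
    return r_phi, r_bis, r_pbis, pbis_from_bis

for tenth in range(1, 10):
    r_tet = tenth / 10
    print(r_tet, [round(v, 3) for v in table2_row(r_tet)])
```

Under these assumptions Column 4 exceeds Column 5 by an amount that grows with the reliability of the item, which is the attenuation that the comparison is meant to show.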

Similar tables can readily be com- 
puted for other levels of item diffi- 
culty. Again, there is no paradox, 
but the criterion distributions are 
skewed as well as flat when item inter- 
correlations are greater than zero. 
The assumption of a normal distribu- 
tion of the criterion is not compatible 
with the mechanics of adding test 
items together. 
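
The dependence of the criterion distribution on item intercorrelation can be examined numerically with a minimal sketch. It assumes a particular generating model that the article does not specify: each item is a dichotomized normal variable cut at its median, all items share one common factor, and the criterion is the proportion passed out of an indefinitely large number of such items. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist

def true_scores(phi, n_persons=20000, seed=0):
    """Simulated criterion scores (proportion of items passed).

    phi: product-moment correlation between any two items, 0 < phi < 1.
    Items are assumed to have difficulty .5 and a common normal factor.
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    rho = math.sin(math.pi * phi / 2.0)   # latent correlation implied by phi
    slope = math.sqrt(rho / (1.0 - rho))
    rng = random.Random(seed)
    unit_normal = NormalDist()
    return [unit_normal.cdf(slope * rng.gauss(0.0, 1.0))
            for _ in range(n_persons)]

def bin_shares(scores, bins=5):
    """Share of persons falling in each of `bins` equal score intervals."""
    counts = [0] * bins
    for s in scores:
        counts[min(int(s * bins), bins - 1)] += 1
    return [c / len(scores) for c in counts]

for phi in (0.1, 1 / 3, 0.7):
    print(round(phi, 3), [round(x, 3) for x in bin_shares(true_scores(phi))])
```

In this model the shares are peaked at the center for low item correlations, approximately equal when the correlation is one third, and piled up at the extremes for higher correlations.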

The problem is more complex if 
item difficulties vary. Item relia- 
bility can no longer be estimated from 
the intercorrelations of the items in 
the test, but must be defined as cor- 
relation with a comparable item in 
another test. The comparable item 
must measure the same function and 
must have the same mean and vari- 
ance. Intercorrelations of items hav- 
ing different means and variances, 
but otherwise measuring the same 
function equally reliably, will be 
lower than the products of the square 
roots of their reliabilities. 
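
One familiar way to see why unequal difficulties depress item intercorrelations, offered here as an illustration rather than as the author's derivation, is the ceiling on the phi coefficient between two pass-fail items with different proportions passing. The function name is hypothetical.

```python
import math

def max_phi(p1, p2):
    """Largest phi attainable between two pass-fail items.

    p1, p2: proportions passing each item (strictly between 0 and 1).
    The ceiling is reached when everyone who passes the harder item
    also passes the easier one.
    """
    harder, easier = sorted((p1, p2))
    return math.sqrt(harder * (1.0 - easier) / (easier * (1.0 - harder)))

print(max_phi(0.5, 0.5))
print(round(max_phi(0.3, 0.7), 3))
```

Two perfectly reliable items measuring the same function can therefore correlate well below unity when their difficulties differ, which is consistent with the need to define item reliability against a comparable item of equal mean and variance.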

With items distributed in difficulty 
there is again no paradox, however, 
since spread of item difficulties will 
also affect the shape of the criterion 
distribution. Variance of item diffi- 
culties forces scores toward the center 
of the distribution and thus counters 
the effect of item intercorrelations. 

It is still possible to argue that a 
paradox is involved since classical 
test theory does not allow for the 
flexibility in shape of distribution re- 
quired in order for the classical for- 
mulas to be applicable. The locus 
of the paradox can, however, be more 
precisely stated. In order for the relationship
between validity and reliability
to hold, one cannot keep constant
both the form of the criterion
distribution and the distribution of
difficulties of the test items. 

Conclusion for test construction. 
The test technician should proceed 
with the job of test construction with- 
out making obeisance to the normal 
curve. His decisions should be made 
in sequential fashion from most to 
least important. The shape of his 
test score distribution is a decision 
made late in the sequence and his de- 
sires about shape of distribution 
should not lead to reversals of earlier, 
more important decisions. It should 
also be noted that all of his decisions 
are made with a particular group of 
examinees in mind, since the level 
and range of their ability are crucial 
factors in the writing and selection of 
test items. 

The first step in test construction 
is to draw up specifications for the 
test. Decisions made at this stage
should not be changed, unconsciously,
by later statistical computations
of the sort used in item analysis,
measures of test reliability, and 
measures of test homogeneity. Blind 
application of statistical procedures 
may change the nature of the test. 

For example, the test may be de- 
signed to predict a particular com- 
plex criterion. Items will then be in- 
cluded in numbers such that their 
weight in the total score will be opti- 
mum for the purpose. Selection of 
items on the basis of correlation of 
items against total test score would 
obviously be inappropriate. A Kuder- 
Richardson homogeneity coefficient 
would also be inappropriate for the 
test as a whole. 

One may also be interested in 
measuring a psychological “trait.” 
In this case the tendency is to think 
of the problem in terms of the ho- 
mogeneity of the items on the 
grounds that a heterogeneous test by 
definition cannot measure a unitary
trait. If homogeneity is defined as
level of item intercorrelations, how- 
ever, there is again the possibility of 
error in the blind following of statisti- 
cal indices. Let us suppose that a 
mechanical information test is de- 
sired. The following are possible ex- 
amples of such tests in descending 
order of item intercorrelations (dif- 
ficulty level of items being held con- 
stant): (a) Information about the 
crosscut saw and its use; (b) Infor- 
mation about saws and their uses; 
(c) Information about woodworking 
tools and their uses; (d) Information 
about tools and their uses in wood- 
working, plumbing, metal working, 
automotive repairing, etc. 

For many purposes test d may be
most desirable, though its homogeneity
as defined above is lower than for
the other tests in the series. This
means that the test specifications
must indicate how broadly this test
should be defined. Even a fairly
broad test may be relatively homogeneous,
however, in that the items in
the test may still be more like each
other than items in other tests in the
same battery (3).

High item reliability is always desired.
Nothing is gained from low
item reliabilities. The reader must
remember, however, that item reliability
is defined as the correlation
with another comparable item, and
is not estimated from correlations
with all other items in the test. Hence
there is no contradiction between the
present advice to achieve high item
reliability and that given above
which was to select a desired degree
of homogeneity. The test constructor
should, therefore, as his next step,
write the most reliable items he possibly
can for the function he wants to
measure. By necessity, though not
from choice, item reliabilities will
often be quite low because reliable
measurement in many areas is difficult.

One cannot be as dogmatic about 
high test reliability as about high 
item reliability. Test reliability is a 
function of item reliabilities and item 
intercorrelations; i.e., test reliability 
is in part a function of homogeneity. 
High test reliability can be achieved 
by narrowing the focus of the test 
and attaining high homogeneity. 
Care must be exercised in item selec- 
tion, therefore, not to confuse item 
reliability and homogeneity and 
thereby change the function meas- 
ured by the test. The test technician 
must maintain his original specifica- 
tions in spite of temptations to in- 
crease test reliability. 

The next decision concerns the
shape of the distribution of test
scores desired. Depending on the
purpose of the test, the desired distribution
may be normal, platykurtic,
skewed, or U shaped. For a general
purpose test the writer submits that
a rectangular distribution is most
useful since this distribution most
accurately represents the information
furnished by a psychological test.
That is, the rank-ordering of persons
is accomplished equally well in all
parts of the range when the distribution
is rectangular. This means that
reliability of discrimination is maximized
over all.


The desired shape is achieved or,
more commonly, approached by controlling
the difficulty levels of the
test items. Item difficulties alone are
manipulated because previous decisions
have fixed the general level of
item intercorrelations that are possible.
With high item intercorrelations,
a constant level of item difficulty
will produce a U-shaped distribution.
As the variance of item difficulty
increases, the peaks of the U-shaped
distribution will converge to
the center of the distribution.

The reader should be warned that 
some of the shapes of test score dis- 
tributions are highly theoretical in 
terms of the characteristics of items 
available for most measurement pur- 
poses. One practical outcome, how- 
ever, is to question the decision made 
automatically by most test con- 
structors to vary the difficulty levels 
of the items in the test. With low 
item intercorrelations of the sort ob- 
tained in most aptitude tests only by 
careful selection of the most reliable 
items at a constant level of difficulty 
can a rectangular distribution be 
approached. 
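
The opposing effects of item intercorrelation and spread of item difficulty on the shape of the score distribution can be examined with a minimal sketch. It assumes a generating model not given in the article: items are dichotomized normal variables sharing one common factor, the latent correlation `rho` and the cut points are hypothetical values chosen for illustration, and each person's score is the expected proportion of items passed.

```python
import math
import random
from statistics import NormalDist

UNIT_NORMAL = NormalDist()

def expected_scores(rho, thresholds, n_persons=5000, seed=0):
    """Expected proportion of items passed by each simulated person.

    rho: latent correlation shared by every pair of items (assumed).
    thresholds: one latent cut point per item; 0.0 is difficulty .5.
    """
    rng = random.Random(seed)
    load, resid = math.sqrt(rho), math.sqrt(1.0 - rho)
    scores = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        passed = [UNIT_NORMAL.cdf((load * theta - t) / resid)
                  for t in thresholds]
        scores.append(sum(passed) / len(passed))
    return scores

def bin_shares(scores, bins=5):
    """Share of persons falling in each of `bins` equal score intervals."""
    counts = [0] * bins
    for s in scores:
        counts[min(int(s * bins), bins - 1)] += 1
    return [c / len(scores) for c in counts]

constant_difficulty = [0.0] * 31
spread_difficulty = [-1.5 + 0.1 * i for i in range(31)]

print([round(x, 3) for x in bin_shares(expected_scores(0.9, constant_difficulty))])
print([round(x, 3) for x in bin_shares(expected_scores(0.9, spread_difficulty))])
```

With highly intercorrelated items of constant difficulty the scores pile up at the extremes; spreading the difficulties moves persons toward the middle of the range, so that difficulty is the remaining means of adjusting shape once the level of intercorrelation has been fixed.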


SUMMARY 


1. The attenuation paradox in 
test theory is a result of the assump- 
tion made by previous writers of a 
continuous normal distribution of the 
criterion. 


2. There is no paradox if the criterion
distributions can assume any
shape. If this is considered ipso
facto paradoxical, then the locus of 
the paradox is in one’s inability to 
hold constant both the shape of the 
criterion distribution and the distri- 
bution of item difficulties. 

3. The pervasive use of the assumption
of continuous normal distributions
in test theory and test
statistics is questioned on grounds 
that test data are in the form of rank- 
order, point distributions. 

4. The test technician should make 
decisions in constructing a test in a 
particular sequence. This sequence 
is as follows: 

a. Outline his test specifications. 
This will specify the desired degree 
of homogeneity (level of item inter- 
correlations) wanted in the test. High 
homogeneity is not necessarily de- 
sirable. 

b. Write the most reliable items 
possible to measure the desired func- 
tion or functions. Items of low relia- 
bility are never desired. 

c. Do not always try to maximize 
test reliability, since the latter is a 
function both of item reliability and 
homogeneity. The desired degree of 
homogeneity should be maintained 
even if item-test correlations are low. 

d. Select the form of the raw score 
distribution of test scores desired. 
This can be any form, though a rec- 
tangular distribution is recommended 
for a general purpose test. 

e. Strive to obtain the desired form 
of distribution by varying item difficulties
only. Previous and more important
decisions have fixed the level
of item intercorrelations which is the 
other determiner of shape of distribu- 
tion. 


REFERENCES 


1. BROGDEN, H. E. Variations in test validity
with variation in the distribution of item
difficulties, number of items, and degree
of their intercorrelation. Psychometrika,
1946, 11, 197-214.

2. LOEVINGER, JANE. The attenuation paradox
in test theory. Psychol. Bull., 1954,
51, 493-504.

3. LOEVINGER, JANE, GLESER, GOLDINE C.,
& DuBois, P. H. Maximizing the discriminating
power of a multiple score
test. Psychometrika, 1953, 18, 309-317.

4. TUCKER, L. R. Maximum validity of a
test with equivalent items. Psychometrika,
1946, 11, 1-13.


Received December 6, 1955. 








PSYCHOLOGICAL BULLETIN 
Vol. 53, No. 6, 1956 


THE ABILITY OF HUMAN OPERATORS TO DETECT 
ACCELERATION OF TARGET MOTION¹
ROBERT M. GOTTSDANKER 

Santa Barbara College, University of California 


In the tracking task, or indeed in 
any task requiring adjustment to 
moving objects, the operator is often 
confronted with targets which, while 
preserving their general directions, 
change their velocities. It is the object
of this survey to describe the experimental
literature which deals
with responses that are made to such
accelerated motion. Of particular 
interest in relation to tracking be- 
havior is the extent of acceleration 
which must occur in order for it to 
be noticed. The tracker’s ability to 
match target velocities with his own 
movements must depend in part on 
his sensitivity to change in velocity. 
Other information, though less obviously
applicable, may help in completing
the picture of how the operator
responds to changing velocities.

All of the findings on the topic of 
response to target acceleration that 
the writer has been able to unearth 
are included in the present review. 
Actually, very few studies have dealt 
with this problem. Further, it was 
not always the primary focus of the 
investigation. For this reason, in dis- 
cussing a study scant consideration 
may be given to its major objective. 
Instead, the aspects which bear on 
the present subject matter will be 
emphasized.

A systematic evaluation of the
present status and future
possibilities of work on response to
acceleration requires locating
this problem within the more general
framework of response to target
motion. To do this it will be necessary
to describe some aspects of studies
on constant-velocity motion. However,
there is no intention of making
the present survey broader than is 
shown by the title. Consequently 
such important topics as perceived 
motion from discrete stimuli, induced 
motion, one’s interpretation of his 
own relative motion, etc., will receive 
no mention. A fairly strict limitation 
of subject matter is mandatory be- 
cause the problems of perception and 
action in relation to motion include 
all of the variables of the stationary 
environment in addition to those in- 
troduced by motion. 


DESCRIPTION OF RELEVANT STUDIES 


The complication experiment. The 
classical complication experiment, 
with its ancestry in the personal 
equation of Bessel (2, p. 133), is the 
first of the situations in which the 
effect of acceleration of motion on 
judgment was studied. Wundt (21), 
using a complication pendulum, at- 
tempted to judge the position of the 
pointer at the sound of a bell stroke. 
In Wundt's arrangement, the pointer 
oscillated symmetrically about the 
straight-up position. Figure 1 shows 
a top-pointing pendulum at two posi- 
tions of its motion in the upper 
sketches. It was found by Wundt 
that during the positively accelerated
upward phase, errors of judgment
were negative, i.e., the judged posi- 
tion was an earlier one than the point 
of actual coincidence. The black 
circle in the left-hand sketch illus- 
trates such a judgment. During the 
subsequent negatively accelerated 
downward phase, errors tended to be 
positive. An example is shown in the 
sketch on the right. In addition to 
these results, Wundt found negative 
errors to predominate for slow mo- 
tions and positive errors for rapid 
motions. Intermediate speeds were 
found which had no over-all constant 
error. These last findings were in 
agreement with the results obtained 
when he used a complication clock, 
which moves with constant angular 
velocity. 

Subsequent investigations by von 
Tschisch (20) and by Pflaum (18), 
also using the complication pendu- 
lum, support Wundt’s findings in the 
major respects. Von Tschisch utilized 
sense modes in addition to sound, 
e.g., touch, for his instantaneous 
stimulus and also used combinations 
of stimuli. Geiger’s studies (6) added 
several controls to the earlier work. 
Most important was to use Ss other 
than the experimenter (seven in 
number) and to vary the orientation 
of the pointer and semicircular scale 
on different groups of trials by optical 
methods. Using a constant-velocity 
complication clock, he found nega- 
tive errors to be typical for upward 
movement and positive errors for 
downward movement, regardless of
whether these phases occurred before 
or after the midpoint of the motion, 
or whether the pointer moved clock- 
wise or counterclockwise. Doubt was 
thus cast upon the importance of ac- 
celeration in producing negative or 
positive errors. In attempting to resolve
this question, Klemm (16) employed
both a top-pointing and a
bottom-pointing pendulum. He found
that differences in the two phases of 
motion were more marked for the 
top-pointing pendulum. From this 
he concluded that both the sign of 
acceleration and the direction of 
movement were operating factors. 
In the top-pointing pendulum the 
factors work in the same direction 
and thus reinforce one another, but 
in the bottom-pointing pendulum 
they work in opposite directions and 
are in conflict. Reference to Fig. 1
shows that in the top-pointing pendulum
the two conditions which have
been described as making for negative
error, upward motion and positive
acceleration, occur together.
Similarly, downward motion and
negative acceleration, both of which
are described as making for positive
errors, also coincide. For the
bottom-pointing pendulum, the opposing
conditions coincide: downward motion
and positive acceleration; upward
motion and negative acceleration.

FIG. 1. RELATIONSHIPS IN EXPERIMENTS
ON THE COMPLICATION PENDULUM

Burrow (5), in analyzing previous
work on the complication experiment 
in conjunction with his own complica- 
tion-clock experiments, concluded 
that the effect of acceleration had 
never been demonstrated. It was his 
contention that the direction of error 
on the complication pendulum could 
be explained by the attraction of the 
midpoint of the scale, which becomes 
a kind of goal. However, such a 
central tendency should bring about 
a positive error during the initial 
phase and a negative error during the 
final phase. The results obtained by 
Wundt and his successors were pre- 
cisely the opposite. 

Prediction-motion. Continuative 
responses have been studied in a series
of experiments by the writer (8, 9, 10).
The S tracked a small moving
target in a paper-and-pencil
tracking box. On some trials the tar- 
get had a constant velocity of motion 
but on others the motion was either 
positively or negatively accelerated. 
The special task given to S was to 
continue his tracking responses as if 
he were tracking an airplane which 
had momentarily disappeared behind 
a cloud. The general outcome, as far 
as the accelerated paths of motion 
are concerned, is that S’s continua- 
tions were made at a constant veloc- 
ity rather than at an accelerated one; 
further, this velocity did not match 
the terminal velocity of the visible 
target. For positively accelerated
motions, the continuation velocity
was lower than the terminal velocity,
and on negatively accelerated motions
it was higher. This was interpreted
as reflecting an averaging or
integration of preceding velocities. 
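
The direction of these discrepancies follows from simple averaging, as a minimal sketch shows. It assumes constant acceleration and an equally weighted mean of the velocities over the visible run; the text states only that some averaging or integration was inferred, so the weighting, the function names, and the numerical values are hypothetical.

```python
def terminal_velocity(v_start, accel, duration):
    """Velocity of the target at the moment it disappears."""
    return v_start + accel * duration

def mean_velocity(v_start, accel, duration, steps=1000):
    """Equally weighted average of the velocities during the visible run."""
    dt = duration / steps
    samples = [v_start + accel * (i + 0.5) * dt for i in range(steps)]
    return sum(samples) / steps

# Positively accelerated run: the average lags behind the terminal velocity.
print(mean_velocity(2.0, 1.0, 4.0), terminal_velocity(2.0, 1.0, 4.0))
# Negatively accelerated run: the average exceeds the terminal velocity.
print(mean_velocity(6.0, -1.0, 4.0), terminal_velocity(6.0, -1.0, 4.0))
```

A continuation made at the averaged velocity would thus be too slow after positive acceleration and too fast after negative acceleration, which is the pattern described.
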
Threshold for sudden changes in 
velocity. The detection of sudden 
changes in target rate has been stud- 
ied by Hick (12). In one procedure, 
called the “drum method,” target
motion was generated by passing an
endless belt on which were printed 
sloping lines under a mask which had 
an open slot perpendicular to the 
path of motion. The target moved 
at a constant velocity in the slot for 
between two and four seconds, at 
which time the velocity changed 
instantaneously. The S was to indi- 
cate each time he saw a change and 
to indicate whether it was an increase 
or a decrease. The relative mean 
threshold of change (0.5 probability) 
was 12 per cent of the initial velocity 
when the angular velocity of motion 
was 4.18°/sec. but increased as lower 
target velocities were employed, the 
corresponding values being 41 per 
cent for the 0.38°/sec. velocity and 
133 per cent for the 0.11°/sec. veloc- 
ity. It was Hick’s feeling that the 
12 per cent value at the most rapid 
motion represented a true threshold 
but that the high values for the 
slower motions were artifacts. Dur- 
ing a crossing with slow motion there 
were often a number of changes of 
velocity. The fact that S’s errors on 
slow crossings were usually failures 
to respond rather than the making 
of incorrect responses is indicative 
of inattentiveness. 
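
The relative thresholds reported for the drum method can be restated as absolute changes in velocity by simple arithmetic, as in the following minimal sketch. The three pairs of values are those quoted above; the variable and function names are hypothetical.

```python
# (initial velocity in deg/sec, relative threshold in per cent), drum method
HICK_DRUM_METHOD = [
    (4.18, 12.0),
    (0.38, 41.0),
    (0.11, 133.0),
]

def absolute_threshold(velocity, relative_pct):
    """Change in velocity (deg/sec) corresponding to a relative threshold."""
    return velocity * relative_pct / 100.0

for velocity, pct in HICK_DRUM_METHOD:
    print(velocity, pct, round(absolute_threshold(velocity, pct), 3))
```

Expressed in this way, the two slower velocities give nearly equal absolute thresholds of about 0.15°/sec., whereas the fastest gives about 0.50°/sec.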

In the other procedure, called the 
“oscilloscope method,” targets were
generated on the face of the tube. 
Here S was shown the position where 
the change would take place. He also 
could hear the click of a relay when 
the change did occur. Again, S’s 
task was to indicate whether the 
change was an increase or a decrease 
in velocity. For increases in velocity, 
the relative threshold values ranged 
from slightly below 10 per cent to 
slightly above 13 per cent for veloci- 
ties lying between 0.15 and 10.25°/ 
sec. For decreases in velocity, threshold
values over the same zone of velocities
ranged from above 7 per cent
to about 21 per cent. In some cases,
changes of as little as 2.5 per cent 
were detected significantly better 
than chance. Reducing the view- 
ing period to as short a time as 0.5 
sec. did not reduce the accuracy of 
judgments. However, presenting 
two targets which crossed before each 
changed velocity in identical fashion 
did elevate the threshold somewhat. 

Phenomenological description of har- 
monic motion. Two investigators 
have obtained phenomenological re- 
ports based on the presentation of 
harmonic motion. Metzger (17) used 
a technique in which a fixed light 
source threw shadows of moving ob- 
jects upon a translucent screen, on 
the other side of which was the S. 
One or more vertical rods mounted on 
a horizontal turntable provided tar- 
gets, each rod giving rise to a shadow 
which moved from side to side in a 
sinusoidal fashion. Metzger found a 
great preference for continuous paths 
of motion. For example, after two 
shadows joined together, the two 
“new” objects seen as arising were 
inevitably those which continued the 
previous directions and velocities
rather than those which reversed or 
modified the directions and velocities. 

More pertinent for the present in- 
terest are the recent investigations by 
Johansson (14). This investigator 
also used a shadow technique in con- 
junction with a translucent screen. 
However, he pasted small target 
objects upon a transparent plate 
which was located between the light 
source and the translucent screen. 
The plate was moved horizontally in 
a harmonic manner by means of an 
eccentric drive. The importance of 
this study and of Johansson’s subse- 
quent major study (15) concerns the 
problem of perceptual organization 
of all major elements in a field, rather 
than observations on acceleration.
Nevertheless, there is a clear statement
regarding the perception of accelerated
motion (14, p. 32).


When an O is shown this kind of motion 
passing through a homogeneous field, and is 
asked how the velocity of the object behaves 
if different parts of the path of motion are 
compared one with another, or in other words, 
if the velocity changes along the path, practi- 
cally without exception the same answer is 
received: the point moves slowly just at the 
turning point; but otherwise its velocity is 
constant. 


DISCUSSION 


General observations on studies re- 
ported. It is evident from the forego- 
ing survey that the information on re- 
sponse to acceleration of target mo- 
tion is meager. Nor are the small 
caches of knowledge strategically 
placed for either theoretical or practi- 
cal purposes. Above all, little as yet 
can be stated in quantitative terms. 

The statement by Johansson may 
be broadened to include motions 
other than harmonic in the following 
way: When target velocity changes 
gradually, a person can tolerate a 
good deal of such change without re- 
alizing that the speed is not constant. 
This generalization may prove to be 
of value in the development of a 
coherent point of view. A closely re- 
lated but not identical suggestion is 
that the operator’s perceptual mech- 
anism integrates smoothly changing 
velocities over a considerable period 
of time. Evidence for this conclusion 
was obtained by the writer in his stud- 
ies of prediction-motion. It is also the 
view of the writer that the early work 
on the complication pendulum may 
be explained partially in terms con- 
sonant with the foregoing formula- 
tions. First, S must require some time 
after the instantaneous signal to be- 
come aware of it; an appreciable reac- 
tion time is one of the most predicta- 
ble aspects of behavior. Next, it is 
clear that S allows for his reaction time 
in making his judgment of sound-pen- 
dulum coincidence, otherwise his 
error would always be positive. This 
is not the case: during the positively 
accelerated phase of harmonic mo- 
tion, S judges the pendulum position 
to be an earlier one than is actually 
correct. It is hypothesized that S 
attempts to extrapolate backward to 
the extent of one reaction-time inter- 
val. If he uses the velocity existing 
during a brief period after the in- 
stantaneous signal for the operation, 
apparently being content to disre- 
gard the fact that the velocity is 
changing, the obtained results would 
be expected. In the phase of positive 
acceleration he appears to base his 
extrapolation upon velocities that 
have become too high for correct 
localization and in the phase of nega- 
tive acceleration upon velocities that 
have become too low. It should be 
pointed out that the same qualitative 
predictions would be made by assuming 
that S computes rates by instantaneous 
differentiation rather than 
by integrating over a period of time. 
The quantitative data do not allow 
selecting between the alternatives. 
In any event, S acts very much as 
though he were unaware that the 
velocity is changing. 

The study by Hick, on the other 
hand, shows S to be an extremely ac- 
curate discriminator of velocities, one 
who can sometimes detect an increase 
of 2.5 per cent and who needs no more 
than 0.25 sec. either before or after 
the change to make the discrimina- 
tion. How may this description be 
reconciled with that of the uncritical 
S who can tell that harmonic motion 
is changing in velocity only at the 
periods near reversal of direction? 
The obvious difference between the 
stimulus conditions for the disparate 
observations is in the gradualness of 
transition from one velocity to another. 
During a course of harmonic 
motion, acceleration is least just at 
the times that velocities are greatest, 
in the region of midcourse. Because 
there is so little relative acceleration 
in this region, it would be expected 
that it would go unnoticed. The re- 
verse holds true near the ends of the 
motion where the ratio of accelera- 
tion to velocity becomes high and 
finally approaches a value which is 
infinitely high. At the time of an in- 
stantaneous change of velocity, as 
introduced by Hick, acceleration is 
naturally infinitely high. 
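
The contrast drawn above between midcourse and the turning points can be illustrated numerically. The following is a minimal sketch, assuming an ideal sinusoidal displacement $x = A \sin \omega t$; the function names, the amplitude, and the angular frequency are hypothetical choices for illustration and are not taken from any of the studies reviewed.

```python
import math


def harmonic_state(t, amplitude=1.0, omega=1.0):
    """Return (position, velocity, acceleration) for x = A*sin(omega*t)."""
    x = amplitude * math.sin(omega * t)
    v = amplitude * omega * math.cos(omega * t)
    a = -amplitude * omega ** 2 * math.sin(omega * t)
    return x, v, a


def relative_acceleration(t, amplitude=1.0, omega=1.0):
    """Ratio |acceleration| / |velocity|, which equals omega * |tan(omega*t)|."""
    _, v, a = harmonic_state(t, amplitude, omega)
    if v == 0.0:
        # Exactly at a turning point the ratio is unbounded.
        return math.inf
    return abs(a) / abs(v)
```

At midcourse (t = 0 in this parameterization) the ratio is zero, and it grows without limit as t approaches a quarter period, which is the turning point.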

Analysis of thresholds already de- 
termined. It may be profitable to 
consider the response to target ac- 
celeration in the general context of 
research on the perception of motion. 
This approach should be particularly 
useful in clarifying the language and 
problems in the determination of 
thresholds. A systematic analysis of 
thresholds both obtained and obtain- 
able might accomplish several things. 
First, the very organization of the 
material should indicate the voids as 
well as the islands in our present 
knowledge. Second, it should de- 
lineate the operations necessary for 
the obtaining of thresholds, and the 
primary variables whose values must 
be specified. Third, it should reveal 
parallels among thresholds which 
have been studied and suggest extra- 
polations to kinds of motion beyond 
the scope of the original studies. A 
graphic technique and a system of 
notation were devised for conducting 
this analysis. 

The significant kinds of threshold 
relating to motion which are de- 
scribed in the literature are repre- 
sented in Fig. 2. The first has been 
called the threshold of motion (shown 
in 2A). Angular distance as a func- 
tion of time is represented in the 
graph on the left. On the right, the 
function is that of angular velocity 
against time. Gordon (7), in a recent 
experimental study, defines the thresh- 
old of motion as the lowest detectable 
angular velocity (distinguishing it 
from the threshold of displacement, 
which is the smallest angular distance 
over which motion of a given rate 
may be detected). 

FIG. 2. GRAPHIC REPRESENTATION OF MOTION THRESHOLDS WHICH HAVE BEEN DETERMINED 

In place of motion, 
which is a general word, the present 
writer would substitute velocity, the 
unit in which the threshold is meas- 
ured. Further, since this is an abso- 
lute threshold, it should be called ab- 
solute threshold of velocity. Represent- 
ed in both graphs of Fig. 2A are three 
test motions: f, g, and h. Motion f is shown 
as being more rapid than is necessary 
for detection, and motion h is too 
slow to be detected. Motion g (the 
dark line) is that whose velocity just 
permits detection. In the graph on 
the left, the threshold is the slope of 
the g line, or $ds_g/dt$; in the graph on 
the right, the threshold is shown by 
the height of the g line, or $v_g$. The 
motions shown have been equated in 
time but could as logically have been 
equated in distance. Also, the par- 
ticular time used is an arbitrary de- 
cision of the experimenter. Conse- 
quently, the value of either the fixed 
time or distance employed must be 
specified. Troland reports that the 
minimum values generally found for 
this threshold lie between 1’ and 
2’/sec. (19, p. 380). 

Whereas the foregoing threshold 
is an absolute threshold, indicative of 
S’s accuracy in distinguishing be- 
tween motion and no motion, that 
represented in Fig. 2B is a difference 
threshold. It is a measure of how 
well S distinguishes between two mo- 
tions of different velocities. This 
problem has not been studied in as 
fine detail as the absolute threshold. 
In the course of a series of investiga- 
tions by J. F. Brown on the percep- 
tion of motion, the study by Brown 
and Mize (4) furnishes an expression 
of the accuracy of Ss in equating the 
velocities of two sets of moving 
squares on endless belts. The pro- 
portional difference required for dis- 
crimination (Weber fraction) given 
by the writers is 0.024. However, it 
should be noted that this value actu- 
ally refers to the constant error (or 
bias) obtained with the method of 
limits and so does not carry the im- 
plied meaning. If the standard and 
test motions are shown under the 
same conditions, there should be no 
over-all bias whatsoever, but this 
obviously does not mean that the 
matching is perfectly accurate. 

Another variety of study bearing a 
relation to difference threshold is 
that of the motion parallax cue of 
distance discrimination. If two ob- 
jects move at the same linear veloc- 
ity, it is possible to tell which is 
nearer because it will have a higher 
angular velocity. A study by Graham 
et al. (11) shows that for two vertical 
needles moving horizontally at right 
angles to the line of regard, a differ- 
ence in distance of the needles from 
the S which gives rise to a differential 
angular velocity of about 30”/sec. 
will provide a threshold distance dis- 
crimination. 

The graphs for difference threshold 
of velocity are shown in Fig. 2B. The 
axes carry the same meaning as in A. 
The dotted line, j, represents the 
standard motion; f, g, and h repre- 
sent three test motions which are 
respectively more divergent from the 
standard than is necessary for dis- 
crimination, just detectably different 
(at the threshold), and too much like 
the standard to be distinguished from 
it. On the left, the threshold is the 
difference in slopes between test line 
g and standard line j, or $ds_g/dt - ds_j/dt$. 
On the right the threshold is 
shown by the difference in height be- 
tween lines g and j, or $v_g - v_j$. In addi- 
tion to the specification of time of 
presentation, it is evident that as the 
standard velocity is arbitrary, it too 
must be specified. As the graphs are 
merely illustrative no attempt has 
been made in Fig. 2 (or Fig. 3) to 
maintain equivalent scales on the 
left and right sides. 

A somewhat related experiment 
should be mentioned. This is the field 
study by Biel and G. E. Brown (1), in 
which Ss were asked to estimate the 
linear velocities of various airplanes 
during their courses of motion. Low 
velocities were overestimated and 
high ones were underestimated. The 
S’s knowledge of the performance 
characteristics of the several types of 
aircraft used had considerable influence 
on his judgments. Such a study 
could be represented in Fig. 2B by 
showing the actual rate of the plane 
as the dotted line, S’s mean judg- 
ment as a solid line, and his variabil- 
ity as a zone about his mean judg- 
ment. It also would be necessary to 
have the y axis represent linear in- 
stead of angular distance. 

As may be seen in Fig. 2C, the Hick 
experiment on the detection of in- 
stantaneous change of velocity may 
be represented formally in much the 
same manner as experiments on the 
difference threshold (Fig. 2B). The 
standard motion is now shown to pre- 
cede the test motions. Of course, on 
any one trial only one of the alterna- 
tives follows the standard. As the 
times of presentation of the standard 
and test motions may be varied inde- 
pendently of one another, both must 
be specified in addition to the veloc- 
ity of the standard motion. 

Thresholds of acceleration. As was 
pointed out in the previous discus- 
sion, the discrepancy between Hick’s 
results and those of investigators us- 
ing harmonic motion could be at- 
tributed to difference in the response 
to smooth change in velocity and to 
discontinuous change. As far as 
thresholds of acceleration are con- 
cerned, further work with harmonic 
motion would appear to be of limited 
value as this motion is a single com- 
plex case in which the extent of ac- 
celeration runs the gamut during each 
cycle. Also, all higher orders of deriv- 
atives are present as well as accelera- 
tion. The detection of instantaneous 
change, although dealing with simple 
linear velocities (except at the point 
of change) is also a special case in 
which acceleration takes on an in- 
finite value. The general case for 
study would be that in which there 
was a constant amount of accelera- 
tion during a motion. Test motions 
with different amounts of accelera- 
tion could then be compared. This 
would parallel the work on the abso- 
lute and difference thresholds of 
velocity. The kind of motion to be 
studied is necessarily that which is 
represented by the equation $s = nt + mt^2$, 
the equation which provides 
for a constant amount of acceleration. 
(Here s is usually measured in angu- 
lar units and t in seconds of time.) 

No description has been reported 
of an experimental determination of 
threshold of acceleration, although 
Hick and Bates (13) report an im- 
pression gained from preliminary in- 
vestigation that rate must double 
every five seconds for acceleration to 
be noticed. Several different experi- 
mental procedures suggest themselves 
for obtaining absolute threshold of 
acceleration. Some of these may be 
mentioned. First, there could simply 
be judgment by S for each of the 
test motions as to whether the mo- 
tion was accelerated. Second, stand- 
ard constant-velocity motion could 
be presented in paired trials with the 
various test motions, S’s task being 
to decide which member of the pair 
was accelerated. Third, a technique 
analogous to that of Hick’s could be 
used in which S would be required to 
judge whether a test motion was pos- 
itively or negatively accelerated. 
Whatever technique is adopted, sev- 
eral test motions will be used which 
differ in amount of acceleration. 
They can be equated in both time 
and distance, and hence in 
velocity. 
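
The equating of test motions described above can be sketched in a few lines. This is an illustration only, assuming motions of the form $s = nt + mt^2$; the function names and the numerical values of duration and distance are hypothetical. For a chosen total time and total distance, the coefficient n is fixed once m is chosen, and the acceleration is the constant 2m.

```python
def equated_motion(total_time, total_distance, m):
    """Return (n, m) so that s = n*t + m*t**2 covers total_distance in total_time."""
    n = total_distance / total_time - m * total_time
    return n, m


def position(t, n, m):
    """Distance travelled at time t."""
    return n * t + m * t ** 2


def velocity(t, n, m):
    """First derivative of position; a straight line in t with slope 2*m."""
    return n + 2.0 * m * t


def acceleration(m):
    """Second derivative of position; constant for this family of motions."""
    return 2.0 * m
```

Under these assumptions, every motion in the family has the same velocity at the temporal midpoint, namely total distance divided by total time, whatever its acceleration. In a plot of velocity against time the lines therefore all pass through one point.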

In Fig. 3A the graphs represent a 
determination of the absolute thresh- 
old of acceleration, where each test 
motion is compared with a constant- 
velocity standard, which is shown by 
the dotted line. As in Fig. 2, the 
three solid lines, f, g, and h represent 
motions which are above threshold, 
just at the threshold, and below the 
threshold. In the graph on the left, 
the threshold is represented by the 
second derivative of the function of 
the line g, $d^2s_g/dt^2$. In the graph on 
the right the threshold is the slope 
of line g, $dv_g/dt$. 

It may be noted that a manipula- 
tion was possible in this experiment 
which was not possible in the de- 
termination of the thresholds of 
velocity: the motions differ only in 
one respect, acceleration, but are the 
same in time and distance. When 
experimenting upon the velocity 
thresholds, the test velocities are 
naturally different. But also if the 
motions are equated in time they 
must differ in distance and vice versa. 
Comparison may be made between 
the right-hand side of Fig. 3A and 
the left-hand side of Fig. 2A. In 
both, the equations are seen to be 
linear, and the statements of thresh- 
old are parallel, $dv_g/dt$ as compared 
with $ds_g/dt$. The intersection of all 
the lines at the same point in Fig. 3A 
shows how it was possible to equate 
velocity. A similar arrangement in 
the case of Fig. 2A would simply 
mean that the motions would center 
about the same position, a considera- 
tion which is irrelevant in the de- 
termination of thresholds. As far as 
required specifications for the abso- 
lute threshold of acceleration are 
concerned, since acceleration may be 
varied independently of both time 
and distance, the particular time- 
distance combination used must be 
specified rather than only one or the 
other as in the case of the absolute 
threshold of velocity. 

Also as yet undetermined is the dif- 
ference threshold of acceleration. 
What would be desired is a measure 
of the accuracy with which S is able 
to distinguish between one extent of 
acceleration and another. In Fig. 3B 
the operations involved in determin- 
ing such a threshold are shown. A 
motion with a standard acceleration 
is shown by the dotted line and the 
usual three test motions by the solid 
lines. As in the case of absolute 
threshold, it is possible to equate the 
motions in both time and distance. 
In the graph on the left, the threshold 
is equal to the difference between the 
second derivatives of the functions 
represented by lines g and j: $d^2s_g/dt^2 - d^2s_j/dt^2$. 
In the graph on the right 
the threshold is equal to the differ- 
ence in slopes of the g and j lines: 
$dv_g/dt - dv_j/dt$. The same parallels 
and differences exist between the 
right-hand side of Fig. 3B and the 
left-hand side of Fig. 2B as in the 
comparison made for Fig. 3A and 
Fig. 2A. As in the case of absolute 
threshold of acceleration, the particu- 
lar time-distance combinations used 
must be specified. In addition, the 
value of the arbitrary standard ac- 
celeration employed must be stated. 

The question may have occurred 
to the reader of whether it is really 
worth while to determine a difference 
threshold of acceleration. After all, 
there is no end to the order of deriva- 
tives of motion. Certainly at some 
point there must be an end to the 
utility of determinations of absolute 
and relative thresholds. Perhaps the 
real question concerns the kinds of 
discriminations the operator can 
make. Evidently, if values of the 
third derivative of distance are suffi- 
ciently high, it too may be detected; 
the intuitive term “‘jerk’’ has been 
applied to this characteristic by 
mechanical engineers. Perhaps it is 
beyond this point that the human 
operator has insignificant ability to 
discriminate. 

The constancy problem. One very 
important consideration has been 
slighted in all of the preceding discus- 
sion. It is that thresholds have been 
described in angular terms whereas 
an approximately linear path of mo- 
tion is probably more typical than a 
circular one. In the case of the abso- 
lute threshold of velocity the value 
for any given linear situation may be 
rather accurately specified in angular 
terms. This is because the arc which 
would be subtended is so small (often 
less than one second) that the angular 
rate is essentially constant through- 
out. Circular motions have been used 
predominantly in this work. In the 
studies of difference threshold of 
velocity, the paths are necessarily 
longer. Linear paths have been used 
in most of the studies. Obviously the 
angular velocity must vary from 
point to point. Yet the statements 
of threshold are usually given in an- 
gular terms. The reason for this is 
clear. It is thus in order that the 
threshold may be stated independ- 
ently of S’s distance from the moving 
object. In the same way, it may 
prove to be of more importance or 
interest to determine thresholds of 
acceleration for targets moving in 
linear rather than circular paths. The 
same solution would appear to be 
necessary; a threshold would be 
stated in average angular value for 
the course of motion. 

The very fact that constant-veloc- 
ity linear motion is seen as such de- 
serves comment. It is a constancy 
phenomenon in the same sense as size 
constancy; equal linear extents are 
judged as equal when at different 
distances and thus differing in angu- 
lar extent. In the present case, the 
extent judged equal is linear velocity 
rather than linear distance. This 
point is not the same as that made 
by Johansson (14, p. 255), who refers 
to the previously mentioned percep- 
tion of harmonic motion in which the 
velocity is taken to be unchanging 
for the greater extent of the motion 
as an example of constancy. There 
is no equating of equal linear extents 
but rather an inability to discrimi- 
nate among different velocities 
whether considered as linear or as 
angular. 

Constancy of motion was one of 
the problems investigated by J. F. 
Brown (3). His method was to have 
an S match velocities of moving ob- 
jects which were at different dis- 
tances, a procedure which corre- 
sponds exactly to the typical experi- 
ment on size constancy. It should be 
remarked that within each of the mo- 
tions in this experiment there must 
necessarily be the single-object con- 
stancy already identified; each ob- 
ject, although taking on a range of 
values of angular velocity, appears 
to move at a constant linear velocity. 

A parallel may be found in judg- 
ments of static magnitudes. Shape 
constancy can also be looked upon as 
a type of single-object constancy. 
There is a constancy situation even 
when a large square is put at some 
distance from the observer and di- 
rectly before him in the frontal plane. 
The angular distances are necessarily 
less at the two sides and at the top 
and bottom than in the center region. 
To the writer’s knowledge the ex- 
istence of a constancy phenomenon 
of objects so situated has neither been 
studied nor mentioned previously. 
When two objects are compared in 
the experiment on size constancy, 
there also exists the single-object con- 
stancy (of shape) within each of the 
figures. 

A related point on single-object 
velocity constancy is that not only 
does a target which is moving parallel 
with the ground change its angular 
velocity, but it also changes its angu- 
lar elevation. Angular elevation is 
low when the target is far off and 
high when it is near. The fact that 
it is seen as maintaining a level path 
could be called single-object con- 
stancy of direction. It would be of 
interest to know whether and to what 
extent a tracker is influenced by his 
tendency to perceive objects as mov- 
ing in a world of rectangular coordi- 
nates when his controls (such as 
cranks and hand-wheels) operate 
from angular inputs. 

No matter what aspect of the prob- 
lem of response to target motion is 
examined, it will be evident that far 
less has been done than remains to 
be done. Circular motion has been 
studied in some situations but not 
in others, similarly linear motion. 
There have been a few rather special 
studies of harmonic motion. How- 
ever, motion paths of greater com- 
plexity and in three dimensions have 
attracted no investigation. The pau- 
city of research on responses to ac- 
celerated motion and the absence 
even of discussion on higher order 
derivatives of motion has already 
been mentioned. The psychology of 
response to target motion lies in the 
future. 


SUMMARY 


The experimental literature on re- 
sponses to acceleration of target mo- 
tion was reviewed. One significant 
observation was that smoothly ac- 
celerated motion is generally re- 
sponded to as if the velocity were 
constant. Suggestions were made of 
a basic approach toward obtaining 
thresholds of acceleration. Examples 
of studies on constant velocity mo- 
tion were included in order to develop 
a systematic graphic method of de- 
scribing experiments on motion. The 
phenomenon of velocity constancy of 
a single moving target was identified 
and generalized. 


TRANSFORMED STATISTICS FOR USE IN 
TEST CONSTRUCTION 


In most test construction situa- 
tions it is desirable, if not absolutely 
necessary, to select from the test 
items which are available either (a) 
those which contribute most to test 
reliability, or (b) those which have 
the strongest relationship to an ex- 
ternal criterion variable, or else 
(c) those items which to some extent 
meet both requirements a and b. In 
any case, a relatively large number 
of item-test or item-criterion statis- 
tics will usually be required in order 
to identify the items which will com- 
prise the best final test, and the en- 
suing computation can be very labori- 
ous. A number of writers (1, 2, 4, 6) 
have reported on the merits of group- 
ing the test or criterion distributions 
into a relatively small number of 
symmetrical categories for the pur- 
pose of simplifying the computation 
of such item statistics. The chief 
advantage of such coarse grouping is 
the increased economy in time spent 
on computation, which at the same 
time is accompanied by a minimum 
loss of information. There appears 
to be no readily available literature 
containing formulas which are both 
economical to apply and at the same 
time utilize highly efficient grouping. 
This paper is intended partially to 
remedy this need. 

It is well known that when fre- 
quency distributions are grouped 
into broad categories, the informa- 
tion lost decreases the efficiency of 
statistics computed from such data. 
It can be shown, however, that the 
loss is less for some kinds of divisions 
into categories than it is for others. 
Flanagan (2) has shown that group- 
ing scores into symmetrically ar- 
ranged categories is relatively effici- 
ent when there are as many as five 
or seven categories. For example, 
seven categories containing, from low 
to high scores, the percentages of 
cases, 4, 8, 25, 26, 25, 8 and 4, for 
which the corresponding new scores, 
-3, -2, -1, 0, 1, 2 and 3, respec- 
tively, have been assigned will yield 
a (maximum) variance due to dif- 
ferences between categories of nearly 
95 per cent. The maximum variance 
between categories for five categories, 
if scored -2, -1, 0, 1, and 2, occurs 
when they contain, respectively, 9, 
20, 42, 20 and 9 per cents of cases; it 
is about 91 per cent. Traditionally 
much item selection has been carried 
out using distributions which have 
been divided, as recommended by 
Kelley (4), into only three categories 
containing 27 per cent low, 46 per 
cent middle and 27 per cent high 
scores. In this case, the variance be- 
tween categories is only 81 per cent. 
For a given number of categories, 
moderate variation in percentages of 
cases assigned to different categories 
lowers the maximum less than might 
be expected. 
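
The percentages quoted above can be checked with a short calculation. The sketch below rests on an assumption not stated explicitly in the text: that the "variance between categories" is the squared correlation between the integer category scores and an underlying standard normal variable. The function name and its return layout are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def grouping_efficiency(percents, scores):
    """Return (r_squared, k, sd) for a coarse grouping of a normal variable.

    percents: percentages of cases per category, from low to high scores.
    scores:   integer score assigned to each category, in the same order.
    r_squared is the squared correlation of the category score with the
    underlying standard normal variable, k = 1/r is the corresponding
    correction for grouping, and sd is the standard deviation of the scores.
    """
    std_normal = NormalDist()
    props = [p / 100.0 for p in percents]
    mean = sum(p * x for p, x in zip(props, scores))
    sd = sqrt(sum(p * (x - mean) ** 2 for p, x in zip(props, scores)))

    covariance = 0.0
    cumulative = 0.0
    lower_density = 0.0  # normal density at minus infinity
    last = len(props) - 1
    for index, (p, x) in enumerate(zip(props, scores)):
        cumulative += p
        if index < last:
            upper_density = std_normal.pdf(std_normal.inv_cdf(cumulative))
        else:
            upper_density = 0.0  # normal density at plus infinity
        # Integral of z * pdf(z) over the category is pdf(lower) - pdf(upper).
        covariance += x * (lower_density - upper_density)
        lower_density = upper_density

    r = covariance / sd
    return r * r, 1.0 / r, sd
```

Under this assumption the three-category 27-46-27 split gives about 81 per cent, the five-category 9-20-42-20-9 split about 91 per cent, and the seven-category split nearly 95 per cent, in agreement with the figures cited.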

Although a large number of item- 
test or item-criterion relationships 
may be required, only a measure of 
the relative strength of such relation- 
ships is most often needed. Because 
of this fact, and because of empirical 
evidence indicating high accuracy, 
as well as high efficiency, for coarse 
grouping methods (2), it would seem 
worthwhile in most test-construction 
problems to follow Flanagan's recom- 
mendations: first to include a few 
additional cases to offset loss in effi- 
ciency, and then to apply a coarse 
grouping transformation. 


APPLICATIONS OF A PARTICULAR 
TRANSFORMATION 


Test (or criterion) means, and 
numbers of subjects N, and items n 
are invariants under the kind of area 
transformations discussed above. In 
any problem, because of the sym- 
metrical nature of the new scores, 
the means become zero; and n and 
N, being independent of the cate- 
gories, are constants of the trans- 
formation. The variance of the new 
scores is of course constant for any 
particular set of categories, even 
though it has been relatively increased 
by the coarse grouping (as well as 
absolutely reduced because of the 
smaller range of the new scores). 
Similarly, correlation coefficients com- 
puted from coarsely grouped scores 
are attenuated, so that, for example, 
if required item-test or item-criterion 
relationships are the typical point- 
biserial r's, then a correction should 
be applied. 

It is possible, however, to choose 
an efficient set of categories which at 
the same time contains proportions 
of cases such that the correction for 
coarse grouping is implicit in the 
formulas. A set possessing this com- 
putational advantage is one contain- 
ing 9, 19, 44, 19, and 9 per cents of 
cases, the new scores being, respec- 
tively, 2, 1, 0, -1 and -2. The be- 
tween-categories variance for this 
transformation is 90.5 per cent, which 
is almost the maximum obtainable for 
5 categories. Formulas are given be- 
low for the more useful statistics after 
this particular transformation has 
been applied to test and criterion 
distributions. 

If the transformed test scores are 
2, 1, 0, -1 and -2, then the covari- 
ance $C_{iT}$ of the original scores of test 
T with item i becomes the trans- 
formed covariance, 


$$C'_{iT} = (2e + f - g - 2h)/N = D_{iT}/N. \qquad [1]$$


In [1] the frequencies of a (preferred 
or correct) response for item i for 
papers assigned scores 2, 1, -1 and 
-2 are, respectively, e, f, g, and h. 
Subsequently D's such as $D_{iT}$ will 
always refer to differences like the 
one in parentheses in [1], and primes 
will always indicate other trans- 
formed quantities. 

Next we write the item-test point- 
biserial correlation, 


$$r_{iT} = k\,r'_{iT}, \qquad [2]$$


where k is the correction for grouping 
the test scores into the broad cate- 
gories. Assuming that the original 
test scores T are approximately nor- 
mally distributed, the value of k can 
be shown (5, pp. 393-402) to be 
1.051. The standard deviation for 
the chosen score set, 2, 1, 0, -1 and 
-2, to which correspond categories 
containing percentages of cases 9, 
19, 44, 19, and 9, is $S'_T = 1.049$. Using 
these two values and [1] and [2], the 
item-test correlation, $r_{iT} = C_{iT}/S_iS_T$, 
is transformed as follows, 


$$\frac{C_{iT}}{S_iS_T} = k\left(\frac{C'_{iT}}{S_iS'_T}\right) = \frac{1.051}{1.049}\cdot\frac{D_{iT}}{NS_i} \approx \frac{D_{iT}}{NS_i}. \qquad [3]$$


Statistics such as $S_i$, the standard de- 
viation of item i, which apply to items 
alone, are unaffected by the trans- 
formation [3]. Setting 1.051/1.049 
equal to 1.000 (instead of 1.002) in 
[3] introduces an error which for the 
present problem is negligible. 
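
As an illustration of how [3] might be applied, the sketch below computes the approximate item-test correlation $D_{iT}/NS_i$ from the four tallies e, f, g, and h. It assumes a dichotomously scored item, so that $S_i = \sqrt{pq}$ with p the proportion of preferred responses; the function names are hypothetical, and the counts used in the usage note are invented for illustration.

```python
from math import sqrt


def weighted_difference(e, f, g, h):
    """D = 2e + f - g - 2h, the numerator count defined for formula [1]."""
    return 2 * e + f - g - 2 * h


def item_sd(n_preferred, n_cases):
    """Standard deviation of a dichotomous item (assumed 0/1 scoring)."""
    p = n_preferred / n_cases
    return sqrt(p * (1.0 - p))


def item_test_r(e, f, g, h, n_preferred, n_cases):
    """Approximate item-test correlation from formula [3], D / (N * S_i)."""
    return weighted_difference(e, f, g, h) / (n_cases * item_sd(n_preferred, n_cases))
```

With invented counts for N = 100 papers (e = 8, f = 14, g = 6, h = 1, and 51 preferred responses in all, including the center category), D is 22 and the estimated correlation is about .44.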

Solving [3] for $C_{iT}$, we now obtain 


$$C_{iT} = S_TD_{iT}/N. \qquad [4]$$


Replacing subscript T by subscript 
C in the foregoing equations gives 
analogous transformed values for a 
criterion distribution C; for example, 
expressions analogous to [3] and [4] 
are 


$$C_{iC}/S_iS_C = D_{iC}/NS_i, \qquad [5]$$

and 

$$C_{iC} = S_CD_{iC}/N, \qquad [6]$$


respectively. 

From a well-known relation (3) the 
variance $V_T$ of a test T containing n 
items may be written as the sum of 
the n item-test covariances. Using 
this fact and summing [4] for items, 


$$V_T = \sum C_{iT} = S_T\sum D_{iT}/N. \qquad [7]$$


Dividing [7] by $S_T$, an estimate of the 
standard deviation of the original 
test scores is obtained, 


$$S_T = \sum D_{iT}/N, \qquad [8]$$


as a function of the item counts de- 
fined for use in [1]. Similarly, the 
square of [8] gives an estimate of the 
variance of the original scores. 

The validity coefficient, or correla- 
tion of test T with criterion C, may 
be written 


$$r_{TC} = C_{TC}/S_TS_C = \sum C_{iC}/S_TS_C. \qquad [9]$$


Substituting [8] and $\sum C_{iC}$, which is 
[6] summed for n items, in [9], 


$$r_{TC} = \sum D_{iC} \Big/ \sum D_{iT}. \qquad [10]$$


Thus [10] is the test validity esti- 
mated solely from item counts. 

For item-selection purposes it is 
often required that the criterion cor- 
relation of an item i significantly ex- 
ceed zero before including it in an 
experimental test. One way of 
achieving this is to include i only if 


$$C_{iC} \geq S_iS_Cz/\sqrt{N}, \qquad [11]$$


where z = 1.96, or some larger value 
of the normal deviate corresponding 
to a known level of significance. Sub- 
stituting [6] in [11], 


$$D_{iC} \geq S_iz\sqrt{N}. \qquad [12]$$


Use of [12] as an item selection condi- 
tion has been discussed previously 
(6), where it was noted that setting 
$S_i = .5$, the maximum value, provides 
a conservative statistical test which 
has the practical effect of insuring 
(a) that the test will contain (statisti- 
cally) valid items, (b) that most of 
the selected items will have large 
variances. 
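
The selection rule in [12], $D_{iC} \geq S_i z \sqrt{N}$, reduces to comparing each item's weighted difference with a single cutoff. The following minimal sketch uses the conservative setting $S_i = .5$ as the default; the function names and the sample size in the usage note are hypothetical.

```python
from math import sqrt


def selection_cutoff(n_cases, z=1.96, item_sd=0.5):
    """Smallest weighted difference D_iC that satisfies condition [12]."""
    return item_sd * z * sqrt(n_cases)


def select_items(d_values, n_cases, z=1.96, item_sd=0.5):
    """Return the indices of items whose D_iC meets the cutoff."""
    cutoff = selection_cutoff(n_cases, z, item_sd)
    return [index for index, d in enumerate(d_values) if d >= cutoff]
```

For N = 100 papers the cutoff is about 9.8, so an item with a weighted difference of 22 would be retained and one with a difference of 7 would not.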

Finally, it should be noted that 
[10] can also be written in another 
way, namely, 


$$r_{TC} = k^2r'_{TC} = (1.051)^2\sum T'C'/(1.049)^2N \approx \sum T'C'/N. \qquad [13]$$


In [13] $k^2$ is the double correction for 
grouping both the T and C scores 
into the same-sized broad categories, 
and $S'_T = S'_C = 1.049$. The final ap- 
proximation in [13], achieved by set- 
ting $(1.051)^2/(1.049)^2$ equal to unity 
(instead of 1.004), is still close enough 
for our purposes. 

In some test-construction problems it may be easier to obtain the sum of transformed cross products $\sum T'C'$ and use [13] than it would be either to compute variances from original scores or to obtain additional item counts for the purpose of estimating the validity by using [10]. For example, suppose it is required to construct an experimental “criterion-specific” test by applying [12] to a pool of items. After papers have been grouped according to their criterion scores $C$, and items for which [12] holds have been selected,¹ it is practically always then necessary to obtain the variance, validity, and reliability of the test comprising the selected items. These values can immediately be approximated, without first having to tally raw test scores or to obtain item-test relationships, by using [13] as follows.

¹ The papers should be marked at the time [12] is applied to indicate later to which criterion distribution category they belong.

First square [10], solve for $(\sum D_{iT})^2$, and substitute the latter in the square of [8] to obtain an expression for the original variance,

$V_T = (\sum D_{iC})^2 / (N r_{TC})^2$. [14]


Substituting [14] for the variance in Kuder-Richardson formula 20 for reliability,

$r_{TT} = \frac{n}{n-1} \left[ 1 - \frac{(N r_{TC})^2 \sum V_i}{(\sum D_{iC})^2} \right]$. [15]


Summation terms in [14] and [15] are obtained by summing the $D_{iC}$ for items selected by [12], and the item variances $V_i$ are obtainable as usual from the total item counts, also available after using [12]. The validity coefficient is needed and can be estimated by [13]; once computed it may then also be used in [14] and [15].
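
A minimal sketch of [14] and [15] follows. The function names and the numerical inputs are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def variance_from_validity(sum_d_ic, n_subjects, r_tc):
    """Estimate V_T by [14]: (sum D_iC)^2 / (N * r_TC)^2."""
    return (sum_d_ic / (n_subjects * r_tc)) ** 2


def kr20_from_counts(n_items, sum_item_var, sum_d_ic, n_subjects, r_tc):
    """Estimate r_TT by [15], i.e. KR-20 with V_T replaced by [14]."""
    v_t = variance_from_validity(sum_d_ic, n_subjects, r_tc)
    return (n_items / (n_items - 1)) * (1.0 - sum_item_var / v_t)


# Hypothetical values: 10 items, N = 100, sum of D_iC = 150, r_TC = 0.5,
# and item variances summing to 2.25.
v_t = variance_from_validity(150, 100, 0.5)        # (150 / 50)^2 = 9.0
r_tt = kr20_from_counts(10, 2.25, 150, 100, 0.5)   # (10 / 9) * (1 - 0.25)
```

Since $r_{TC}$ enters [14] as a squared divisor, a validity estimate that is too low inflates the variance estimate and hence the reliability estimate.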

To obtain $\sum T'C'$ for use in [13], regroup the papers into the 5 categories, this time according to their $T$ scores, fill in the 5×5 contingency table for frequencies of the $T'$ and $C'$ scores (center categories may be ignored), and sum the 16 kinds of cross products. After the papers have been reordered according to $T$ scores, the remaining operations take only a few minutes, even when there are a large number of subjects.
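
The tallying step might be sketched as follows. The category scores −2, −1, 0, 1, 2 are an assumption: the zero center is implied by the remark that center categories may be ignored, but the other values are not stated in this section. The frequency table is hypothetical, with marginals (7, 27, 32, 27, 7) chosen so that the grouped standard deviation is close to 1.049.

```python
# Assumed category scores (not given in this section of the text).
SCORES = (-2, -1, 0, 1, 2)


def sum_cross_products(table, scores=SCORES):
    """Sum f_jk * t_j * c_k over a 5x5 table of frequencies.

    table[j][k] is the number of papers in T' category j and C' category k.
    Cells in the center row or column contribute nothing because their
    score is zero, leaving the 16 non-center kinds of cross product.
    """
    total = 0
    for j, row in enumerate(table):
        for k, freq in enumerate(row):
            total += freq * scores[j] * scores[k]
    return total


# Hypothetical frequencies for N = 100 papers (rows: T', columns: C').
table = [
    [4, 3, 0, 0, 0],
    [3, 12, 10, 2, 0],
    [0, 10, 12, 10, 0],
    [0, 2, 10, 12, 3],
    [0, 0, 0, 3, 4],
]
n_subjects = sum(sum(row) for row in table)
sum_tc = sum_cross_products(table)
r_estimate = sum_tc / n_subjects   # final form of [13]
```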

It should be emphasized that the 
accuracy of the formulas is immedi- 
ately dependent upon fulfillment of 
the normality assumptions concern- 
ing the original test and criterion dis- 
tributions. In the case of item sta- 
tistics, departures from normality 
are, for reasons already discussed, not likely to be serious; however, estimates such as [10], [13], [14], and [15]
depend upon normality assumptions 
for two distributions and therefore 
in practice should be regarded only 
as rapidly obtainable approximations 
to the actual values. 


AN EXAMPLE 


In a large-scale research a subtest 
comprising 33 true-false personality 
inventory items was for several rea- 
sons of theoretical interest. The 
items were taken from a larger mas- 
culinity-femininity factor scale, and 
appeared to measure ‘‘fantasy, sensi- 
tivity and esthetic interest,’’ and pos- 
sibly also some kind of ‘‘neurotic con- 
flict.” The KR-20 reliability of the 
33 items was .71. 

The 33-item subtest, hereafter referred to as “$X$,” was scored for a new random sample of 200 college women. Statistics for the obtained distribution corresponding to the first four moments were $\bar{X} = 19.555$, $V_X = 21.587$, $g_1 = -.1851$ and $g_2 = -.4366$. Although the distribution appeared slightly flattened and negatively skewed, test ratios for $g_1$ and $g_2$ (−1.080 and −1.28¢, respectively) offered no evidence that the population distribution was anormal.

Papers were divided according to 
X scores into the five categories rec- 
ommended above, and item counts 
for 636 other true-false items were 
obtained (see [1]). Application of [12] with $z = 2.58$ and $S_i = .5$ selected 89 of the new items as potential correlates of $X$. (Since $z$ was chosen to correspond to the .01 level, only about 6 would be expected by chance alone.)
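
The figures in this paragraph can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are those reported in the text.

```python
import math

n_subjects = 200     # college women in the new sample
n_new_items = 636    # other true-false items examined
z = 2.58             # normal deviate for the .01 level
s_i = 0.5            # maximum item standard deviation

# Right-hand side of [12]: smallest count D_iX that satisfies the condition.
cutoff = s_i * z * math.sqrt(n_subjects)

# Number of items expected to pass by chance alone at the .01 level.
expected_by_chance = 0.01 * n_new_items
```

The cutoff comes to a little over 18, and the chance expectation to a little over 6 items, against the 89 actually selected.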

At the same time that item counts 
were obtained for the new items, 
counts were also obtained for the 33 items in $X$. Since the reliability of $X$ was only .71, it was not expected that its items would all correlate well with the total score; indeed, applying condition [12] would retain only 22 of them. One empirical check on formula [8] was immediately available, however, using these counts; $\sum D_{iX} / N$ for the 33 items in $X$ was 4.645, the square of which is 21.576, a value fairly close to 21.587, the actual sample variance of $X$.
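
The size of the discrepancy in this check of [8] can be made explicit. The sketch uses only the two reported values; the variable names are illustrative.

```python
import math

s_x_estimate = 4.645    # reported sum of D_iX over N for the 33 items
v_x_actual = 21.587     # reported sample variance of X

v_x_estimate = s_x_estimate ** 2                            # square of [8]
relative_error = (v_x_estimate - v_x_actual) / v_x_actual   # signed error
s_x_actual = math.sqrt(v_x_actual)                          # actual SD
```

The estimated variance falls short of the actual value by well under one tenth of one per cent.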

For purposes of this example, the 89 new items were scored as a test $T$ to be correlated with $X$. The papers were reordered according to $T$ scores, and $\sum T'X'$ obtained from the contingency table for $T'$ and $X'$; this value was 154, which when used in [13] gave $r_{TX} = .770$. However, because in a few cases different papers having the same $T$ scores but different $X'$ scores could be assigned to different $T'$ categories in the contingency table, the estimated value of $r_{TX}$ could be made to vary between .770 and .795. The correlation obtained using the original $T$ and $X$ scores without grouping was .7703. The variance of the original scores $T$ computed without grouping was found to be $V_T = 197.815$. The sum of $D_{iX}$ for the 89 items was 2259; using this value and $r_{TX} = .78$ in [14] gives $V_T = 209.693$, which is about 6 per cent too large, but which is still close enough for a quickly computed approximation. The sum of variances for the 89 items in $T$ was 18.647. Using the approximate value [14] in [15] gives .921 as an estimate of $r_{TT}$, which is close enough to the correct value .916.
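
The reported figures can be retraced as follows. All inputs are the values given in the text; the variable names are illustrative, and it is an assumption that the "correct value" of the reliability is KR-20 computed with the actual variance 197.815.

```python
n_subjects = 200
n_items = 89
sum_cross = 154         # reported sum of T'X'
sum_d_ix = 2259         # reported sum of D_iX over the 89 items
sum_item_var = 18.647   # reported sum of the 89 item variances
v_t_actual = 197.815    # reported variance of the ungrouped T scores
r_tx_used = 0.78        # validity value the text enters in [14]


def kr20(v_t):
    """Kuder-Richardson formula 20 for a given test variance."""
    return (n_items / (n_items - 1)) * (1.0 - sum_item_var / v_t)


r_tx_grouped = sum_cross / n_subjects                         # [13]
v_t_estimate = (sum_d_ix / (n_subjects * r_tx_used)) ** 2     # [14]
excess = v_t_estimate / v_t_actual - 1.0     # proportional overestimate
r_tt_estimate = kr20(v_t_estimate)           # [15]
r_tt_actual = kr20(v_t_actual)               # assumed source of .916
```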


CONCLUSIONS 


The transformation is relatively 
efficient and also appears to be pre- 
cise enough for general use. It can 
be applied in problems in which item 
counts are available for continuously 
distributed test or criterion scores. 
The statistics discussed are then 
more quickly estimated using the 
transformed values than they would 
be using the original scores. 

Use of such a transformation leaves 
open to question the effects of possi- 
ble departures from normality in the 
original test or criterion distributions. 
Judging from previous research in 
which test data have been normal- 
ized, and from examples such as the 
one in the present paper, these effects 
should not often be extreme enough 
to invalidate results obtained using 
the transformation.


ON THE ORIGIN AND EARLY USE OF THE TERM
VICARIOUS TRIAL AND ERROR (VTE)


KARL F. MUENZINGER 
University of Colorado 


In their recent article in this Journal on “Vicarious trial and error and related behavior” (2), Goss and Wischner say that “to this general pattern of behavior Muenzinger and Fletcher have given the name ‘vicarious trial-and-error,’ abbreviated ‘VTE’.” The origin of this term should
have been ascribed to Muenzinger 
and Gentry. In the article referred to 
by Goss and Wischner I say so (4, p. 
89), but it is possible that I was not 
explicit enough. 

It was Evelyn Gentry (now Evelyn 
G. Hooker) who made the first study 
of the phenomenon in an experiment 
designed for this purpose and not as 
a by-product of other experiments. 
Her results were described in 1930 in 
an M.A. thesis under my direction 
(1) in which the term vicarious trial 
and error with its abbreviation VTE 
were used, and in which reference was 
made to earlier descriptions of this 
kind of behavior by other experimenters.

Our criterion for recording VTE in any one trial was then and still is “a facing into one alley before the other one, whether right or wrong, was entered” (3, p. 77).

At first my co-workers and I 
thought that the presence of dis- 
criminanda within the choice alleys 
was the necessary condition for the 
occurrence of VTE. However, this view almost immediately turned out
to be wrong because we observed (in 
1929) that VTE also occurred when 
the choice alleys contained no dis- 
criminanda. In this case an animal 
had to make a left or a right turn in 
conjunction with the presence or 
absence of a tone that was sounded 1 
meter above the choice point (3). 
We realized that it was the mere pres- 
ence of the choice alleys that pro- 
duced VTE. 

We have always emphasized the 
role of experimental conditions in the 
relative frequency of VTE. To illu- 
strate, our observations throughout 
the years have invariably shown that 
as compared with no shock the pres- 
ence of electric shock after the point 
of choice is accompanied by more 
VTE (3, p. 78). We have also found 
invariably that in a difficult discrimi- 
nation situation the frequency of 
VTE is higher than in an easier one 
(3, p. 81). 

We have always assumed that a 
relation between VTE and learning 
efficiency exists. This was in line 
with the notion prevalent 30 years 
ago that actual trial and error is in- 
dispensable in certain types of learn- 
ing. But we have also stated ex- 
plicitly that ‘‘we have not demon- 
strated a causal relationship between 
the two” (3, p. 84). 


REFERENCES 


1. Gentry, E. A substitution for trial and error in the white rat. Unpublished master’s thesis, Univer. of Colorado, 1930.

2. Goss, A. E., & Wischner, G. J. Vicarious trial and error and related behavior. Psychol. Bull., 1956, 53, 35-54.

3. Muenzinger, K. F. Vicarious trial and error at a point of choice: I. A general survey of its relation to learning efficiency. J. genet. Psychol., 1938, 53, 75-86.

4. Muenzinger, K. F., & Fletcher, F. M. Motivation in learning: VI. Escape from electric shock compared with hunger-food tension in the visual discrimination habit. J. comp. Psychol., 1936, 22, 79-91.

Received March 20, 1956.


ERRATUM 


In “The Water-Jar Einstellung Test as a Measure of Rigidity,” by Eugene E. Levitt, Vol. 53, No. 5, September, 1956, p. 368, right-hand column:
For: “1. After eight years of research, evidence for the validity of the water-jar test as a measure of validity is still lacking.”
Read: “1. After eight years of research, evidence for the validity of the water-jar test as a measure of rigidity is still lacking.”











